Google's Query Language Wrapper

LebCitLebCit
5 min read

Some years ago (2021), I've used Google Sheets as a data store and created some useful applications using Apps Script.
Recently, I needed to get the data of a publicly available and editable spreadsheet for a project that I'm working on.
Initially, I considered using Apps Script, but then I discovered a simpler solution:
GViz aka Google Visualization API Query Language!

GViz is well-documented and easy to use for accessing spreadsheets shared with 'anyone who has the link can view' permissions, without requiring credentials, as explained in the authorization section of Ingest Data from Google Sheets.

After reading the documentation, I understood that getting data from a public spreadsheet is a easy as the following:

// Construct the URL for accessing the Google Sheets document data
let url = `https://docs.google.com/spreadsheets/d/${sheetID}/gviz/tq`

// Fetch data from the specified URL
let response = await fetch(url)

// Read the text content from the response
let jsonpResponse = await response.text()

As detailed in the JSON Response Format section, the returned response format defaults to JSONP when no authentication process is detected.
To have an idea of the returned response, take a look at the Examples section.
As we can see, the data we're looking for resides in the table property. Lets extract the JSON from the response and parse it by adding after the previous code the following one:

// Extract the JSON string from the JSONP response
let jsonMatch = jsonpResponse.match(/google.visualization.Query.setResponse\((.*?)\);/)
let jsonString = jsonMatch ? jsonMatch[1] : ""

// Parse the JSON string into a JavaScript object
let jsonObject = JSON.parse(jsonString)

We can now access the table with jsonObject.table or jsonObject["table"]. The table displays the data in 2 separated arrays of objects:

  1. "cols", the columns where each object represents a column with its id, label, pattern and type

  2. "rows", the rows where each object holds the corresponding data

If all we want is to get the data from the spreadsheet, a simple function can do the job like the following one:

function convertTableToObjectArray(table) {
    // Get column labels and types
    const columnLabelsAndTypes = table.cols.map((col) => ({
        label: col.label,
        type: col.type,
    }))

    // Convert table rows to array of objects
    const arrayOfObjects = table.rows.map((row) => {
        const obj = {}
        row.c.forEach((cell, index) => {
            const { label, type } = columnLabelsAndTypes[index]
            // Use formatted value if type is string, otherwise use value
            obj[label] = type === "string" ? cell.v : cell.f
        })
        return obj
    })

    return arrayOfObjects
}

// Access the data inside the `table`
let dataTable = jsonObject.table

// Convert dataTable to an array of objects
const arrayOfObjects = convertTableToObjectArray(dataTable)
console.log(arrayOfObjects)

And that would be it since we now have an array of objects that we can manipulate as we want!

However, GViz provides additional benefits as its description indicates that it enables us to conduct diverse data manipulations using the query on the data source.
In short, we can add some parameters after the URL to modify the returned response.
The available parameters are detailed in the following sections:

  1. Request Formattq and tqx

  2. Creating a Chart from a Separate Spreadsheetheaders, gid and sheet

  3. Query Source Rangesrange

The tq parameter (table query) uses the Language Clauses as an encoded string to deliver the data corresponding to the query.
The tqx parameter is a set of colon-delimited key/value pairs to get the response in a particular format (out), with or without an optional JS callback function (responseHandler), and optionally export it into a specified filename (outFileName).
The headers parameter specifies how many rows are header rows.
The gid (id number of the sheet) and sheet (the name of the sheet) define the sheet we want to work on in a multi-sheet document, we should only use one of them and not both.
The range parameter specifies what part of a spreadsheet to use in the query, going from a single cell to the whole spreadsheet.

Considering that the data could be returned in a CSV Response Format, I pursued an alternative approach, as my objective was not solely to retrieve the data but also to extract a subset based on specific conditions.
So instead of getting the data then extract the needed part of it with JS, I used GViz powers to do it in three steps:

  1. Get the data in csv format
    https://docs.google.com/spreadsheets/d/${sheetID}/gviz/tq?tqx=out:csv

  2. Add parameters to extract only the desired part of the data
    headers, gid or sheet, range and finally tq

  3. Parse the CSV data into an array of objects so it can be easily accessible and manipulated

Looking at the table provided as an example under the Language Syntax Overview and paying attention to the fact that column IDs in spreadsheets are always letters, as explained in Setting the Query in the Data Source URL, suppose that we want to run the following query:

  1. Get the data from the sheet named Employees_Data (assuming a multi-sheet document)

  2. Work only on the range from A1 to D6 (name to salary and 5 rows)

  3. Return only the data in the name's column (A)

  4. Where the name starts with 'Da'

  5. Arrange the data by ascending order of salary

The URL for this request would be something like:

let url = `https://docs.google.com/spreadsheets/d/${sheetID}/gviz/tq?tqx=out:csv&headers=1&sheet=Employees_Data&range=A1:D6&tq=SELECT%20A%20WHERE%20A%20starts%20with%20'Da'%20ORDER%20BY%20D%20asc`

where ${sheetID} consists of 44 characters. It's a combination of letters, numbers, and special characters, such as hyphens and underscores.
As the query complexity increases, it becomes progressively challenging to write and comprehend, making it visually unappealing.
That's why I developed GoogleQLWrapper, a JavaScript class that retrieves data from a public spreadsheet, accommodating various parameters based on the document's structure and the user's requirements.
GoogleQLWrapper provides two asynchronous methods:

  1. getSheetData (sheetID, queryOptions = {})

  2. getSourceData (sheetID, sheetName)

You can find more documentation about it on the GoogleQLWrapper GitHub repository.

I hope that you'll find it useful.

0
Subscribe to my newsletter

Read articles from LebCit directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

LebCit
LebCit

I'm LebCit, a Citizen of a small country called Lebanon in the Middle East. I love to read a lot, learn as much as I can, and of course apply and share with others.