A quick overview of BigQuery functions

When we talk about functions in BigQuery, we're referring to several distinct capabilities.

Beyond the standard built-in functions like CURRENT_TIMESTAMP() or LENGTH(), BigQuery helps users to define custom functions that extend SQL capabilities. These user-defined functions belong to the larger category of routines (alongside stored procedures), enabling logic reuse.

🔹 Types of Functions in BigQuery

By Duration

➡️ Persistent functions – Stored in your dataset and reusable across all sessions

➡️ Temporary functions – Available only within your current session (created with TEMP keyword)

By Return Type

➡️ Scalar functions – Return a single value per input row (which can be complex types like structs or arrays) → typically used in SELECT or WHERE clauses

➡️ Table-Valued Functions (TVFs) – Return entire tables, requiring you to SELECT FROM the function

BigQuery functions can be written in either SQL or JavaScript.

Based on their processing nature:

➡️ Regular UDFs – Process individual rows, transforming inputs into a single output value

➡️ User-Defined Aggregate Functions (UDAFs) – Combine multiple rows into a single result using custom logic (currently in preview)

🔹 Beyond BigQuery: Remote Functions

For complex processing requirements, BigQuery offers remote functions, which allows us to:

➡️ Send data to Google Cloud Functions or other external services

➡️ Process it using a programming language

➡️ Return results to our query

This opens access to the vast ecosystem of libraries in languages like Python.

Their Place in Modern SQL

Back in the day when I just started with SQL Server, I used scalar functions sparingly (as a junior I was always warned about performance 🤓) and occasionally employed TVFs for small reusable datasets.

Today, with modern transformation frameworks like dbt and Dataform, I find myself almost not using BigQuery — the same reusable logic is now defined as macros or custom JS functions within these frameworks.

💡 I'm curious:

➡️ How often do you use UDFs or TVFs in your SQL environment?

➡️ Do you prefer handling reusable logic in your SQL code or in external frameworks?

➡️ Any interesting use cases you've seen for remote functions for unusual/specialized processing needs?

Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.

0
Subscribe to my newsletter

Read articles from Constantin Lungu directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Constantin Lungu
Constantin Lungu

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified