Streamline Your Analytics: A Guide to Reusable Window Definitions in RisingWave

RisingWave LabsRisingWave Labs
4 min read

Window functions are one of the most powerful tools in a data engineer's SQL toolkit, especially in stream processing. They allow you to perform complex calculations across a set of related rows—like calculating a moving average, ranking items within a category, or finding the cumulative sum of sales.

However, as your analytics become more sophisticated, your queries can become unwieldy.

The pain of repetition

Imagine you're analyzing employee data. You want to calculate not just the average salary per department, but also the total salary, the highest salary, and each employee's rank within their department.

Each of these calculations uses the exact same window definition: partition the data by department and order it by salary. Your query would look something like this:

-- The "Before" Query
SELECT
  depname,
  empno,
  salary,
  avg(salary) OVER (PARTITION BY depname ORDER BY salary DESC),
  sum(salary) OVER (PARTITION BY depname ORDER BY salary DESC),
  rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;

Look at that OVER clause. It’s copied and pasted three times. While this works, it introduces a few significant problems:

  • It's verbose and hard to read: The repetitive logic clutters your SELECT list, making it difficult to see at a glance what the query is actually producing.

  • It's error-prone: What if you need to change the windowing logic? Perhaps you decide to add a secondary sorting key. You’d have to find and replace every single instance of the OVER clause. Miss one, and you've introduced a subtle, hard-to-debug bug where different aggregations are calculated over different windows.

  • It's a maintenance nightmare: As queries grow, this copy-paste pattern becomes a technical debt that makes future updates slow and risky.

There has to be a better way. And now, there is.

Introducing named windows

We're excited to introduce Named Windows in RisingWave v2.5.0, a new feature designed to help you write cleaner, safer, and more maintainable SQL.

The concept is simple but powerful: Define your window logic once, give it a name, and reuse it as many times as you need.

Instead of repeating the window definition, you can use the WINDOW clause at the end of your query to create a named window. Here’s how the previous query looks now:

-- The "After" Query with a Named Window
SELECT
  depname,
  empno,
  salary,
  avg(salary) OVER w,
  sum(salary) OVER w,
  rank() OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);

The difference is night and day. The SELECT list is clean and focused on the output, while the windowing logic is neatly defined in a single, authoritative place.

How to use named windows

Getting started with named windows is straightforward. The syntax consists of two parts:

  1. The WINDOW clause: Define your named window at the end of your query, after the FROM or WHERE clause.

     WINDOW window_name AS ( [ PARTITION BY ... ] [ ORDER BY ... ] [ frame_clause ] )
    
  2. The OVER reference: In your SELECT list, reference the named window in your function calls.

     SELECT window_function_name ( ... ) OVER window_name FROM ...
    

Note: With the introduction of this feature, WINDOW is now a reserved keyword in RisingWave. If you were previously using WINDOW as a table or CTE alias, you will need to rename it.

A foundation for smarter analytics

Adopting named windows is a strategic move towards more resilient and insightful analytics. It fundamentally improves your workflow by solving three distinct challenges:

  • Boost clarity and maintainability: By separating the window's definition from its use, queries become self-documenting and radically easier to read. Future updates are simplified from a risky, multi-line edit to a single, straightforward change.

  • Guarantee consistency and eliminate errors: Centralizing the window logic in one place eradicates the risk of subtle bugs caused by copy-paste errors, where different calculations are accidentally run over slightly different data partitions. Your logic is correct by design.

  • Accelerate complex analysis: Effortlessly layer multiple window functions—for ranking, period-over-period comparisons, and moving averages—over the exact same data set. This allows you to build sophisticated analytics faster and with a fraction of the code.

Get started today!

Named windows are a testament to our commitment to making powerful stream processing accessible and efficient. It’s a feature that encourages best practices like the DRY (Don't Repeat Yourself) principle, leading to SQL that is not only more powerful but also a pleasure to write and maintain.

This feature is available now in RisingWave v2.5.0. We invite you to upgrade and give it a try.

We're excited to see how you'll use named windows to uncover new insights from your streaming data!

0
Subscribe to my newsletter

Read articles from RisingWave Labs directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

RisingWave Labs
RisingWave Labs

RisingWave is an open-source distributed SQL database for stream processing. It is designed to reduce the complexity and cost of building real-time applications. RisingWave offers users a PostgreSQL-like experience specifically tailored for distributed stream processing. Learn more: https://risingwave.com/github. RisingWave Cloud is a fully managed cloud service that encompasses the entire functionality of RisingWave. By leveraging RisingWave Cloud, users can effortlessly engage in cloud-based stream processing, free from the challenges associated with deploying and maintaining their own infrastructure. Learn more: https://risingwave.cloud/. Talk to us: https://risingwave.com/slack.