Aggregate according to Time Interval — From SQL to SPL #27

esProcesProc
2 min read

Problem description & analysis:

A certain database table stores time-series data, with intervals of several seconds between each record.

Task: Now we need to do a group and aggregation every minute, summarizing data for 5 minutes each time. For example, generate three records in minutes 1, 2, and 3, and summarize the data for minutes 1–5, 2–6, and 3–7 respectively.

Code comparisons:

SQL

SELECT
    [From], DATEADD(MINUTE, 1, [To]) [To], payload
FROM (
    SELECT
        dt, MIN(dt) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) [From],
        dt [To], SUM(payload) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) payload
    FROM (
        SELECT
            DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0) dt, 
SUM(payload) payload
        FROM #tmstmp
        GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0)
    ) q
) q
WHERE DATEDIFF(MINUTE, [From], [To]) > 3

SQL needs to implement it using nested subqueries and multiple window functions, which makes the code cumbersome.

SPL: SPL provides the syntax for directly accessing positions.

👉🏻 Try.DEMO

A1: Load data.

A2: Group and aggregate by minute.

A3: Generate a new two-dimensional table, where To is taken from the current record, From is taken from 5 minutes after the current record, and the payload summarizes the interval from the current record to the 4th record.

A4: Take from the first item backward to the fifth item.


Free to Try, Powerful to Use — esProc SPL FREE Download.

10
Subscribe to my newsletter

Read articles from esProc directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

esProc
esProc

esProc SPL is a JVM-based programming language designed for structured data computation, serving as both a data analysis tool and an embedded computing engine. FREE download👉🏻: https://www.esproc.com/download-esproc