Add Records that Meet the Criteria before Each Group after Grouping — From SQL to SPL #39

Table of contents

Problem Description & Analysis:
In a certain view of the PostgreSQL database, the row_index field is an underscore separated string and also a grouping field. Some groups’ row_index can be divided into 3 parts, while others can be divided into 2 parts.
Task: Now, before each group of records where row_index can be divided into 3 parts, add the group of records where row_index can be divided into 2 parts, and modify row_index to the row_index of each group.
Solution Hightlights:
The records where row_index can be divided into three parts can be grouped by row_index, and then each group of records can be processed by merging the records where row_index can be divided into two parts before each group of records.
But after SQL grouping, it must aggregate immediately, and subsets cannot be kept to continue processing each group of records. This requires a detour to solve, using multi-layer nested window functions to bypass this problem, which is difficult to code.
SPL supports retaining subsets after grouping, allowing for continued processing of each group of records.
A1: Query the database through JDBC.
A2: Retrieve the records where row_index can be divided into two parts.
A3: Remove A2 from the complete data, which means getting records that row_index can be divided into three parts. Group these records while keeping the order unchanged. The symbol \ represents the difference set, and group@u indicates keeping the original order after grouping.
A4: Loop each group of data: Create a new two-dimensional table according to A2, change row_index to the row_index of this group, keep other fields unchanged, and merge the new two-dimensional table with the data of this group. Finally, merge the data of each group. The symbol | represents merging set members.
✅ Get Started with esProc SPL FREE Download — Try It Free!🆓
Subscribe to my newsletter
Read articles from esProc directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

esProc
esProc
esProc SPL is a JVM-based programming language designed for structured data computation, serving as both a data analysis tool and an embedded computing engine. FREE download👉🏻: https://www.esproc.com/download-esproc