Find Consecutive Alphabetical Characters in String — From SQL to SPL #28


Problem description & analysis:
The field of a certain database table is strings.
Task: Now we need to find the strings that contain at least 5 consecutive letters sorted alphabetically, or in other words, the longest substring in a continuous ascending order with a length greater than or equal to 5. Note to exclude non-letter characters.
Code comparisons:
SQL:
SELECT value
FROM table_name t
CROSS JOIN LATERAL (
SELECT MIN(lvl) AS start_pos
FROM (
SELECT LEVEL AS lvl,
SUBSTR(t.value, LEVEL, 1) AS ch
FROM DUAL
CONNECT BY LEVEL <= LENGTH(t.value)
)
MATCH_RECOGNIZE(
ORDER BY lvl
MEASURES
FIRST(lvl) AS lvl
PATTERN (first_row consecutive{3,})
DEFINE first_row AS (ch BETWEEN 'A' AND 'Z' OR ch BETWEEN 'a' AND 'z'),
consecutive AS (ch BETWEEN 'A' AND 'Z' OR ch BETWEEN 'a' AND 'z')
AND ASCII(PREV(ch)) + 1 = ASCII(ch)
)
HAVING MIN(lvl) > 0
)
SQL needs to use nested subqueries and CONNECT BY to simulate sequences, and then use MATCH_RECOGNIZE to process sequences. The code is cumbersome and difficult to understand.
SPL: SPL directly provides sequence calculation functions:
👉🏻 Try.DEMO
A1: Load data.
A2: Split the string into a sequence by character, group the sequence, and start a new group when the current member is less than or equal to the previous member or the previous member is not a letter. Select the string with the longest group that has a length greater than or equal to 5. The function group can keep the grouped subsets for subsequent processing after grouping, rather than immediately aggregating them. @i represents grouping according to conditions.
Get Started with esProc SPL — esProc SPL FREE Download.
Subscribe to my newsletter
Read articles from esProc directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

esProc
esProc
esProc SPL is a JVM-based programming language designed for structured data computation, serving as both a data analysis tool and an embedded computing engine. FREE download👉🏻: https://www.esproc.com/download-esproc