SQL Problem of the day #7
Problem Statement: Replace the null value with the previous value.
NOTE: null values are one thing that you have to deal with in your career right from the day you start working. So, better be equipped.
Table creation and data insertion script:
CREATE TABLE mom_2004 (
match_no int,
mom_player varchar(50),
match_opponent varchar(50)
);
INSERT INTO mom_2004 VALUES
(1, 'Virender Sehwag', 'South Africa'),
(2, 'Sourav Ganguly', 'South Africa'),
(3, NULL, 'South Africa'),
(4, 'Sachin Tendulkar', 'England'),
(5, 'Sourav Ganguly', 'England'),
(6, 'Sachin Tendulkar', 'England'),
(7, NULL, 'England'),
(8, NULL, 'Bangladesh'),
(9, 'Rahul Dravid', 'Australia'),
(10, NULL, 'Australia');
use this free website https://sqliteonline.com/ to create tables and run queries on them.
Table view:
Solution
Make sure to try it on your own before jumping onto the solution.
Hint #1: The first thing that should strike us is using a lag function on the mom_player column. lag function essentially creates a new column with lags its parent column by 'n' values.
Learn more about 'em here: https://learnsql.com/blog/lead-and-lag-functions-in-sql/
we can use the coalesce function along with a correlated subquery to perform the job here!
SELECT
match_no,
COALESCE(
mom_player,
(SELECT top 1 mom_player FROM mom_2004 AS prev WHERE prev.match_no < m.match_no AND mom_player IS NOT NULL ORDER BY prev.match_no DESC)
) AS mom_player,
match_opponent
FROM
mom_2004 AS m
ORDER BY
match_no;
Learn About SQL Subqueries:
Now, coming to the Query explanation:
SELECT Clause -
mom_player
:For each row in the result set, the
COALESCE
function is applied to themom_player
column.If
mom_player
is not NULL, it remains unchanged.If
mom_player
is NULL, the subquery is evaluated.
Subquery Execution (
mom_player
):The subquery
(SELECT mom_player FROM mom_2004 AS prev WHERE prev.match_no < m.match_no AND mom_player IS NOT NULL ORDER BY prev.match_no DESC LIMIT 1)
is executed.It looks for the most recent non-NULL value for
mom_player
from rows wherematch_no
is less than the current row'smatch_no
.The result is ordered by
match_no
in descending order andLIMIT 1
ensures only one result is retrieved.
COALESCE Evaluation (
mom_player
):The result of the
COALESCE
function is the first non-NULL value among its arguments.If the original
mom_player
is not NULL, it is used.If the original
mom_player
is NULL, the value retrieved from the subquery is used
I hope I did justice by explaining the solution fairly intuitively but don't lose heart if you cannot understand this method SQL has several tricks in its trade and I will share another such trick to solve this Question ( Volaaaaaa..)
method #2
let's create windows where we have nulls like:
Sachin Tendulkar |
null |
null |
what if we can make a group of these and then take the value of ' Sachin Tendulkar' and replace nulls with that?
SELECT *, count(mom_player) over(order by match_no rows between unbounded preceding and current row) groups
from mom_2004
Table view:
1 | Virender Sehwag | South Africa | 1 |
2 | Sourav Ganguly | South Africa | 2 |
3 | null | South Africa | 2 |
4 | Sachin Tendulkar | England | 3 |
5 | Sourav Ganguly | England | 4 |
6 | Sachin Tendulkar | England | 5 |
7 | null | England | 5 |
8 | null | Bangladesh | 5 |
9 | Rahul Dravid | Australia | 6 |
10 | null | Australia | 6 |
Now, our grouping is achieved. we have to take this as a subquery, take the first value from each group, and replace all the null values with that.
with mom_cte as (
SELECT *, count(mom_player) over(order by match_no rows between unbounded preceding and current row) groups
from mom_2004 )
SELECT match_no, first_value(mom_player) over(partition by groups ORDER by match_no ) mom_player, match_opponent
FROM mom_cte
Hopefully, this query is easier to comprehend and grasp.
Thank you, ram ram!
see you tomorrow folks <3
Subscribe to my newsletter
Read articles from ishwar kokkili directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by