Dealing with missing data
Detecting missing values
When you have no data, the empty database field contains the word NULL
Because NULL is not a number, it is not possible to use = , < or > to find or compare missing values.
To determine if a column contains a NULL value , use IS NULL and IS NOT NULL
Removing missing values
SELECT *
FROM Incidents
WHERE Shape IS NOT NULL
Imputing missing values (I)
What if you want to replace the missing values with another value instead of omitting them? You can do this using the ISNULL() function. Here we replace all the missing values in the Shape column using the word 'Saucer'
SELECT Shape, ISNULL(Shape, 'Saucer') AS NewShape
FROM Incidents
You can also use ISNULL() to replace values from a different column instead of a specified word.
-- Check the IncidentState column for missing values and replace them with the City column
SELECT IncidentState, ISNULL(IncidentState,City) AS Location
FROM Incidents
-- Filter to only return missing values from IncidentState
WHERE IncidentState IS NULL
Imputing missing values (II)
What if you want to replace missing values in one column with another and want to check the replacement column to make sure it doesn't have any missing values? To do that you need to use the COALESCE statement.
SELECT Shape, City, COALESCE(Shape, City, 'Unknown') as NewShape
FROM Incidents
Replace missing values in Country with the first non-missing value from IncidentState or City, in that order. Name the new column Location
-- Replace missing values
SELECT Country, COALESCE(Country, IncidentState, City) AS Location
FROM Incidents
WHERE Country IS NULL
Subscribe to my newsletter
Read articles from Rukayat Balogun directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by