Exploratory data analysis (EDA) on World Layoffs Data


INTRODUCTION
Exploratory data analysis (EDA) is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. EDA helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.
EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task, and it provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate. Originally developed by American mathematician John Tukey in the 1970s, EDA techniques continue to be widely used in the data discovery process today.
WHY EDA IS IMPORTANT IN DATA ANALYSIS
The main purpose of EDA is to help look at data before making any assumptions. It can help identify obvious errors, better understand patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables. Data analysts can use exploratory analysis to ensure their results are valid and applicable to the desired business outcomes and goals. EDA also helps stakeholders by confirming they are asking the right questions. It can help answer questions about standard deviations, categorical variables, and confidence intervals. Once EDA is complete and insights are drawn, its findings can then be used for more sophisticated data analysis or modeling, including machine learning.
EXPLORATORY DATA ANALYSIS (EDA) USING SQL
SELECT MAX(total_laid_off)
FROM layoffs_staging2
;
# Returns the largest number of layoffs in a single record (one reported layoff event)
SELECT *
FROM layoffs_staging2
WHERE percentage_laid_off = 1
ORDER BY funds_raised_millions DESC
;
# Returns the companies that went under sorted by highest funds raised.
SELECT company, industry, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY company, industry
ORDER BY 3 DESC
;
# Returns the total laid off by company and industry over the three-year period, with the highest totals first.
SELECT MIN(`date`) AS Start_Date, MAX(`date`) AS End_Date
FROM layoffs_staging2
;
# Time period the data covers
SELECT industry, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY industry
ORDER BY 2 DESC
;
# Returns the industry with the most layoffs
SELECT Row_Number() Over(ORDER BY SUM(total_laid_off) DESC), country, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY country
ORDER BY 3 DESC
;
# Returns total layoffs by country, numbered from the highest total down
SELECT *
FROM layoffs_staging2
WHERE country = 'Nigeria'
ORDER BY 4 DESC
;
# Returns the companies with the highest layoffs in Nigeria
SELECT stage, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY stage
ORDER BY 2 DESC
;
# Returns the total layoffs by stage of company
SELECT substring(`date`, 1,4) AS Years, SUM(total_laid_off)
FROM layoffs_staging2
WHERE substring(`date`, 1,4) is NOT NULL
GROUP BY Years
ORDER BY SUM(total_laid_off) DESC
;
# Returns the total layoffs by years.
SELECT substring(`date`, 1,7) AS Months, SUM(total_laid_off)
FROM layoffs_staging2
WHERE substring(`date`, 1,7) is NOT NULL
GROUP BY Months
ORDER BY 1
;
# Returns the total layoffs by months.
WITH Rolling_Total_CTE AS (
SELECT substring(`date`, 1,7) AS Months, SUM(total_laid_off) AS SUM_total_laid_off
FROM layoffs_staging2
WHERE substring(`date`, 1,7) is NOT NULL
GROUP BY Months
ORDER BY 1)
SELECT Months, SUM_total_laid_off, SUM(SUM_total_laid_off) OVER (ORDER BY Months) AS Rolling_Total
FROM Rolling_Total_CTE
;
# Returns a rolling total of layoffs by month, showing how layoffs accumulated over time.
SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY company, industry, Years
ORDER BY 4 DESC
;
# Returns the total layoffs by company and years
WITH Company_CTE (company, industry, year,total_laid_off) AS (
SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY company, industry, Years
ORDER BY 4 DESC
)
SELECT *, DENSE_RANK () OVER (PARTITION BY Year ORDER BY total_laid_off DESC) AS Ranking
FROM Company_CTE
WHERE year IS NOT NULL
;
# Ranks companies by total layoffs within each year
WITH Company_CTE (company, industry, year,total_laid_off) AS (
SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)
FROM layoffs_staging2
GROUP BY company, industry, Years
ORDER BY 4 DESC
),
Company_CTE_RAnk AS
( SELECT *, DENSE_RANK () OVER (PARTITION BY Year ORDER BY total_laid_off DESC) AS Ranking
FROM Company_CTE
WHERE year IS NOT NULL
)
SELECT *
FROM Company_CTE_RAnk
WHERE Ranking <=5
;
# Returns the top five Companies with the most layoffs each year
VISUALISATION
The dashboard gives the following insights into the data:
Total layoffs within the dataset during the period.
Maximum single-day layoff.
Number of companies sampled.
Number of countries sampled.
Total Layoffs By Industry
Layoffs By Year and Quarter
Layoffs By Company
Layoffs By Country
Highest Layoffs By Company each year
Findings
Key Figures
Total laid off globally: 365,000+ people
Max single-day layoffs: 12,000 people
Number of companies sampled: 797
Number of countries sampled: 39
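The four figures above could be reproduced with a single aggregate query along the following lines. This is a minimal sketch that assumes the layoffs_staging2 table and the column names used in the earlier queries; the output aliases are illustrative, and the dashboard's own measures may apply filters that are not shown here.

```sql
# Sketch: headline figures in one pass over the cleaned table
SELECT
    SUM(total_laid_off)     AS Total_Laid_Off,     # all layoffs in the dataset
    MAX(total_laid_off)     AS Max_Single_Layoff,  # largest single record
    COUNT(DISTINCT company) AS Companies_Sampled,  # unique company names
    COUNT(DISTINCT country) AS Countries_Sampled   # unique countries
FROM layoffs_staging2
;
```

COUNT(DISTINCT ...) ignores NULL values, so rows with a missing company or country do not inflate the counts.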
Trends Over Time
2022 recorded the highest total layoffs (~53,100), with a significant spike compared to previous years. Layoffs steadily increased each year from 2020 to 2022, peaking in Q1 of 2023. Q4 of 2022 and Q1 of 2023 were particularly heavy with layoffs, indicating a likely response to economic tightening or market corrections.
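The quarterly breakdown is not covered by the queries in the SQL section. One way it might be produced is sketched here, assuming `date` is stored as a DATE column (as the earlier YEAR(`date`) calls imply) and using MySQL's QUARTER() function; the aliases are illustrative.

```sql
# Sketch: total layoffs per year and quarter, in chronological order
SELECT YEAR(`date`) AS Years, QUARTER(`date`) AS Quarters, SUM(total_laid_off) AS Total_Laid_Off
FROM layoffs_staging2
WHERE `date` IS NOT NULL
GROUP BY Years, Quarters
ORDER BY Years, Quarters
;
```

Sorting by year and then quarter keeps the rows in time order, which makes it easier to see where the quarterly peak falls.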
Layoffs by Country
The United States had by far the highest number of layoffs, followed by India, the Netherlands, and Sweden. Other significantly impacted countries include Germany, Brazil, China, and Canada. Emerging economies such as Nigeria, Kenya, and Senegal were also represented but with smaller figures.
Layoffs by Industry
Consumer, Retail, and Tech-related industries were hit the hardest.
Other affected sectors include:
Finance
Healthcare
Education
Crypto
Marketing
Fitness
Infrastructure and Real Estate had relatively fewer layoffs.
Top Companies by Layoffs
Major tech firms had the highest layoffs:
Amazon: ~18K (≈5% of total)
Meta
Salesforce
Microsoft
Other notable companies include Philips, Ericsson, Uber, and Airbnb.
Additional Observations
Layoffs were not evenly distributed; large global firms were the primary sources of job cuts. Startup and funding trends may correlate with layoffs, as seen in the "Count of funds raised" bar overlays.
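One way to quantify how unevenly the layoffs are distributed is to express each company's total as a share of all layoffs, which is how a figure such as Amazon's roughly 18K out of 365K+ works out to about 5%. The query below is a sketch only: it assumes MySQL 8 window functions and the layoffs_staging2 columns used earlier, and the names Company_Totals, Company_Laid_Off, and Pct_Of_Total are illustrative.

```sql
# Sketch: each company's share of all recorded layoffs
WITH Company_Totals AS (
    SELECT company, SUM(total_laid_off) AS Company_Laid_Off
    FROM layoffs_staging2
    GROUP BY company
)
SELECT company,
       Company_Laid_Off,
       # SUM(...) OVER () is the grand total across all remaining rows
       ROUND(100 * Company_Laid_Off / SUM(Company_Laid_Off) OVER (), 1) AS Pct_Of_Total
FROM Company_Totals
WHERE Company_Laid_Off IS NOT NULL
ORDER BY Company_Laid_Off DESC
LIMIT 10
;
```

The WHERE clause drops companies with no recorded layoff count, and LIMIT 10 keeps only the ten largest contributors.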
Summary
This dashboard paints a clear picture of a tech-driven layoff wave, primarily from mid-2022 into 2023, concentrated in the United States and tech-heavy countries, and dominated by major global companies. The data suggests economic or operational restructuring during this period, possibly due to post-COVID corrections, inflationary pressure, or over-hiring during the pandemic.
CONCLUSIONS
This project addressed the challenge of understanding the global wave of layoffs occurring between 2020 and 2023. Amid increasing reports of job losses, particularly in the tech sector, there was a lack of clear and accessible insight into where, when, and to what extent these layoffs were happening. By conducting exploratory data analysis (EDA) on a comprehensive dataset, the goal was to identify key patterns, affected industries and countries, and high-layoff companies, as well as to understand the broader economic context of these workforce reductions.
To solve this problem, the project used SQL to explore and manipulate the dataset, applied EDA principles such as identifying patterns, outliers, and key distributions, and used Power BI to visualize the trends. SQL queries aggregated the relevant data from the layoff dataset, revealing trends across time, geography, and sectors. Power BI then made those insights intuitive and easy to communicate. Through this approach, the project demonstrated skills in data cleaning, query formulation, time series analysis, and interactive visualization.
This approach showcased a solid understanding of EDA principles and their importance in making informed decisions based on raw data. It also drew on data wrangling and preparation, SQL for analytical querying, time series analysis (monthly, quarterly, and yearly trends), visualization best practices to convey insights clearly, and data-driven storytelling to summarize complex findings for business understanding.
The outcome was a set of clear insights: over 365,000 layoffs were recorded globally, with the highest impact in the United States and the tech industry, particularly in 2022 and early 2023. Major companies like Amazon, Meta, and Google were significant contributors. These findings highlight the value of data transparency and early detection of trends for organizations facing similar challenges. For businesses, the key takeaway is to maintain robust data practices, invest in visualization tools, and monitor related economic indicators such as funding trends and growth stages to anticipate workforce disruptions.
Written by Anthony Oghenejabor
