Exploratory data analysis (EDA) on World Layoffs Data

INTRODUCTION

Exploratory data analysis (EDA) is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. EDA helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.

EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task, and it provides a better understanding of data set variables and the relationships between them. It can also help determine whether the statistical techniques you are considering for data analysis are appropriate. Originally developed by American mathematician John Tukey in the 1970s, EDA techniques continue to be widely used in the data discovery process today.

WHY EDA IS IMPORTANT IN DATA ANALYSIS

The main purpose of EDA is to help look at data before making any assumptions. It can help identify obvious errors, better understand patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables. Data analysts can use exploratory analysis to ensure their results are valid and applicable to any desired business outcomes and goals. EDA also helps stakeholders by confirming they are asking the right questions, and it can help answer questions about standard deviations, categorical variables, and confidence intervals. Once EDA is complete and insights are drawn, its features can then be used for more sophisticated data analysis or modeling, including machine learning.

EXPLORATORY DATA ANALYSIS (EDA) USING SQL

SELECT MAX(total_laid_off)

FROM layoffs_staging2

;

# Returns the largest layoff recorded in a single row, i.e. by one company on one day

SELECT *

FROM layoffs_staging2

WHERE percentage_laid_off = 1

ORDER BY funds_raised_millions DESC

;

# Returns the companies that laid off 100% of their staff (that is, went under), sorted by funds raised, highest first.

SELECT company, industry, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY company, industry

ORDER BY 3 DESC

;

# Returns the total laid off by company and industry over the three-year period, with the companies with the highest layoffs first.

SELECT MIN(`date`) AS Start_Date, MAX(`date`) AS End_Date

FROM layoffs_staging2

;

# Time period the data covers

SELECT industry, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY industry

ORDER BY 2 DESC

;

# Returns total layoffs by industry, with the hardest-hit industry first

SELECT ROW_NUMBER() OVER (ORDER BY SUM(total_laid_off) DESC) AS country_rank, country, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY country

ORDER BY 3 DESC

;

# Returns total layoffs by country, numbered from most to fewest

SELECT *

FROM layoffs_staging2

WHERE country = 'Nigeria'

ORDER BY total_laid_off DESC

;

# Returns the companies with the highest layoffs in Nigeria

SELECT stage, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY stage

ORDER BY 2 DESC

;

# Returns the total layoffs by stage of company

SELECT substring(`date`, 1,4) AS Years, SUM(total_laid_off)

FROM layoffs_staging2

WHERE substring(`date`, 1,4) IS NOT NULL

GROUP BY Years

ORDER BY SUM(total_laid_off) DESC

;

# Returns the total layoffs by year, highest first.

SELECT substring(`date`, 1,7) AS Months, SUM(total_laid_off)

FROM layoffs_staging2

WHERE substring(`date`, 1,7) IS NOT NULL

GROUP BY Months

ORDER BY 1

;

# Returns the total layoffs by month, in chronological order.

WITH Rolling_Total_CTE AS (

SELECT substring(`date`, 1,7) AS Months, SUM(total_laid_off) AS SUM_total_laid_off

FROM layoffs_staging2

WHERE substring(`date`, 1,7) IS NOT NULL

GROUP BY Months

ORDER BY 1)

SELECT Months, SUM_total_laid_off, SUM(SUM_total_laid_off) OVER (ORDER BY Months) AS Rolling_Total

FROM Rolling_Total_CTE

;

# Returns a rolling total of layoffs by month, showing how cumulative layoffs grew over time.
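To see the rolling-total pattern in isolation, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is an illustration under assumptions, not the project's actual setup: the queries in this article target MySQL, the sample rows are invented, `substr` stands in for MySQL's `substring`, and the helper name `rolling_totals` is hypothetical. SQLite needs version 3.25 or newer for window functions.

```python
import sqlite3

# Invented sample rows (not from the real dataset): (company, date, total_laid_off)
SAMPLE_ROWS = [
    ("Acme", "2022-11-03", 100),
    ("Beta", "2022-11-20", 50),
    ("Acme", "2022-12-15", 200),
    ("Gamma", "2023-01-10", 300),
    ("Delta", None, 40),  # missing date: dropped by the IS NOT NULL filter
]

ROLLING_TOTAL_SQL = """
WITH monthly AS (
    -- Step 1: collapse individual layoff rows into one total per month
    SELECT substr(date, 1, 7) AS month, SUM(total_laid_off) AS laid_off
    FROM layoffs_staging2
    WHERE date IS NOT NULL
    GROUP BY month
)
-- Step 2: a windowed SUM ordered by month accumulates the monthly totals
SELECT month, laid_off, SUM(laid_off) OVER (ORDER BY month) AS rolling_total
FROM monthly
ORDER BY month
"""


def rolling_totals(rows):
    """Load rows into a throwaway table and return (month, laid_off, rolling_total) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE layoffs_staging2 (company TEXT, date TEXT, total_laid_off INTEGER)"
    )
    conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?)", rows)
    result = conn.execute(ROLLING_TOTAL_SQL).fetchall()
    conn.close()
    return result
```

Calling `rolling_totals(SAMPLE_ROWS)` gives one row per month, where the third value is the sum of that month and every earlier month. The row with no date never reaches the running total because it is filtered out before grouping.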

SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY company, industry, Years

ORDER BY 4 DESC

;

# Returns the total layoffs by company and years

WITH Company_CTE (company, industry, year,total_laid_off) AS (

SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY company, industry, Years

ORDER BY 4 DESC

)

SELECT *, DENSE_RANK () OVER (PARTITION BY Year ORDER BY total_laid_off DESC) AS Ranking

FROM Company_CTE

WHERE year IS NOT NULL

;

# Ranks companies by total layoffs within each year (rank 1 = most layoffs)

WITH Company_CTE (company, industry, year,total_laid_off) AS (

SELECT company, industry, year(`date`) AS Years, SUM(total_laid_off)

FROM layoffs_staging2

GROUP BY company, industry, Years

ORDER BY 4 DESC

),

Company_CTE_RAnk AS

( SELECT *, DENSE_RANK () OVER (PARTITION BY Year ORDER BY total_laid_off DESC) AS Ranking

FROM Company_CTE

WHERE year IS NOT NULL

)

SELECT *

FROM Company_CTE_RAnk

WHERE Ranking <=5

;

# Returns the top five Companies with the most layoffs each year
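The two-step CTE pattern (aggregate per company and year, then rank within each year) can be tried out on a few rows with the sketch below. It assumes an in-memory SQLite database instead of MySQL, uses invented sample data, derives the year with `substr` because SQLite has no `YEAR()` function, and the helper name `top_companies_per_year` is hypothetical.

```python
import sqlite3

# Invented sample rows: (company, industry, date, total_laid_off)
SAMPLE_ROWS = [
    ("Acme", "Retail", "2022-03-01", 500),
    ("Acme", "Retail", "2022-09-01", 100),
    ("Beta", "Finance", "2022-05-01", 600),
    ("Gamma", "Crypto", "2022-07-01", 200),
    ("Beta", "Finance", "2023-01-15", 300),
    ("Gamma", "Crypto", "2023-02-01", 900),
]

TOP_N_SQL = """
WITH company_year AS (
    -- One row per company, industry and year with the summed layoffs
    SELECT company, industry, substr(date, 1, 4) AS yr, SUM(total_laid_off) AS laid_off
    FROM layoffs_staging2
    WHERE date IS NOT NULL
    GROUP BY company, industry, yr
),
ranked AS (
    -- DENSE_RANK restarts at 1 for every year and gives tied totals the same rank
    SELECT *, DENSE_RANK() OVER (PARTITION BY yr ORDER BY laid_off DESC) AS ranking
    FROM company_year
)
SELECT company, yr, laid_off, ranking
FROM ranked
WHERE ranking <= ?
ORDER BY yr, ranking, company
"""


def top_companies_per_year(rows, top_n):
    """Return (company, year, laid_off, ranking) for ranks 1..top_n in each year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE layoffs_staging2 "
        "(company TEXT, industry TEXT, date TEXT, total_laid_off INTEGER)"
    )
    conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(TOP_N_SQL, (top_n,)).fetchall()
    conn.close()
    return result
```

In the sample data Acme and Beta both total 600 in 2022, so `top_companies_per_year(SAMPLE_ROWS, 1)` returns both of them with rank 1. This is worth keeping in mind when filtering on `Ranking <= 5`: with `DENSE_RANK`, ties can produce more than five rows for a year.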

VISUALISATION

The dashboard gives the following insights into the data:

Total layoffs within the dataset during the period

Maximum single-day layoff

Number of companies sampled

Number of countries sampled

Total layoffs by industry

Layoffs by year and quarter

Layoffs by company

Layoffs by country

Highest layoffs by company each year

Findings

Key Figures

Total laid off globally: 365,000+ people

Max single-day layoffs: 12,000 people

Number of companies sampled: 797

Number of countries sampled: 39

Layoffs by Year and Quarter

2022 recorded the highest yearly total of layoffs (~53,100), a significant spike compared to previous years. Layoffs steadily increased each year from 2020 to 2022, and the quarterly peak came in Q1 of 2023. Q4 of 2022 and Q1 of 2023 were particularly heavy, indicating a likely response to economic tightening or market corrections.
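The SQL queries earlier in this article aggregate by year and by month, while the quarterly view comes from the dashboard. As a hedged sketch of how a year-and-quarter breakdown could be computed in SQL, the example below runs against an in-memory SQLite database with invented rows. The quarter is derived from the month number as `(month + 2) / 3` using integer division; in MySQL the built-in `QUARTER()` function could be used instead. The helper name `layoffs_by_quarter` is hypothetical.

```python
import sqlite3

# Invented sample rows: (date, total_laid_off)
SAMPLE_ROWS = [
    ("2022-11-03", 100),
    ("2022-12-15", 200),
    ("2023-01-10", 300),
    ("2023-03-05", 50),
    ("2023-04-01", 25),
]

QUARTERLY_SQL = """
SELECT substr(date, 1, 4) AS yr,
       -- months 1-3 map to 1, 4-6 to 2, 7-9 to 3, 10-12 to 4 (integer division)
       (CAST(substr(date, 6, 2) AS INTEGER) + 2) / 3 AS quarter,
       SUM(total_laid_off) AS laid_off
FROM layoffs_staging2
WHERE date IS NOT NULL
GROUP BY yr, quarter
ORDER BY yr, quarter
"""


def layoffs_by_quarter(rows):
    """Return (year, quarter, laid_off) tuples in chronological order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE layoffs_staging2 (date TEXT, total_laid_off INTEGER)")
    conn.executemany("INSERT INTO layoffs_staging2 VALUES (?, ?)", rows)
    result = conn.execute(QUARTERLY_SQL).fetchall()
    conn.close()
    return result
```

With the sample rows, November and December 2022 fall into Q4 of 2022, January and March 2023 into Q1 of 2023, and April 2023 into Q2.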

Layoffs by Country

The United States had by far the highest number of layoffs, followed by India, the Netherlands, and Sweden. Other significantly impacted countries include Germany, Brazil, China, and Canada. Emerging economies such as Nigeria, Kenya, and Senegal were also represented but with smaller figures.

Layoffs by Industry

Consumer, Retail, and Tech-related industries were hit the hardest.

Other affected sectors include:

Finance

Healthcare

Education

Crypto

Marketing

Fitness

Infrastructure and Real Estate had relatively fewer layoffs.

Top Companies by Layoffs

Major tech firms had the highest layoffs:

Amazon: ~18K (≈5% of total)

Google

Meta

Salesforce

Microsoft

Other notable companies include Philips, Ericsson, Uber, and Airbnb.

Additional Observations

Layoffs were not evenly distributed; large global firms were the primary sources of job cuts. Startup and funding trends may correlate with layoffs, as seen in the "Count of funds raised" bar overlays.

Summary

This dashboard paints a clear picture of a tech-driven layoff wave, primarily from mid-2022 into 2023, concentrated in the United States and tech-heavy countries, and dominated by major global companies. The data suggests economic or operational restructuring during this period, possibly due to post-COVID corrections, inflationary pressure, or over-hiring during the pandemic.

CONCLUSIONS

This project addressed the challenge of understanding the global wave of layoffs occurring between 2020 and 2023. Amid increasing reports of job losses, particularly in the tech sector, there was a lack of clear and accessible insight into where, when, and to what extent these layoffs were happening. By conducting exploratory data analysis (EDA) on a comprehensive dataset, the goal was to identify key patterns, affected industries and countries, and high-layoff companies, as well as to understand the broader economic context of these workforce reductions.

To solve this problem, SQL was used to explore, query, and aggregate the layoff dataset, revealing trends across time, geography, and sectors. EDA principles such as identifying patterns, outliers, and key distributions guided the analysis, and Power BI was then used to visualize the data, making the insights intuitive and easy to communicate. Through this approach, the project demonstrated strong skills in data cleaning, query formulation, time series analysis, and interactive visualization.

This approach showcased a solid understanding of EDA principles and their importance in making informed decisions based on raw data. It also drew on data wrangling and preparation, SQL for analytical querying, time series analysis (monthly, quarterly, and yearly trends), visualization best practices to convey insights clearly, and data-driven storytelling to summarize complex findings for business understanding.

The outcome was a set of clear insights: over 365,000 layoffs were recorded globally, with the highest impact in the United States and the tech industry, particularly in 2022 and early 2023. Major companies like Amazon, Meta, and Google were significant contributors. These findings highlight the value of data transparency and early detection of trends for organizations facing similar challenges. For businesses, the key takeaway is to maintain robust data practices, invest in visualization tools, and monitor related economic indicators such as funding trends and growth stages to anticipate workforce disruptions.

Written by

Anthony Oghenejabor