Layoff Data Cleaning Project
- pritydabhi02
- Oct 11, 2025
Updated: Mar 25
In this project, we walk through the process used to clean the raw layoffs data.
Layoff Data Cleaning & Exploratory Analysis
Problem
Companies had difficulty understanding layoff trends because the data was inconsistent and unclean.
Tools Used: SQL (Window Functions, CTEs)
Approach
Cleaned and standardized raw layoff data using SQL
Removed duplicates using ROW_NUMBER and handled missing values with self-joins
Performed exploratory analysis to identify trends across companies, industries, and time
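To illustrate the self-join idea mentioned in the approach, here is a minimal sketch. It assumes the missing values sit in the industry column and that another row for the same company carries the value; the choice of column is an assumption for illustration, not something the data above confirms.

```sql
-- Sketch only: fill a missing industry from another row of the same company.
-- Assumption: industry is the column with gaps; adjust to the real data.
UPDATE layoffs_staging AS t1
JOIN layoffs_staging AS t2
  ON t1.company = t2.company
SET t1.industry = t2.industry
WHERE t1.industry IS NULL
  AND t2.industry IS NOT NULL;
```

Rows whose company has no populated industry anywhere in the table are left unchanged.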
Key Insights
Identified top 5 companies with highest layoffs using DENSE_RANK
Discovered clear year-over-year spikes during economic downturns
Highlighted monthly layoff trends for workforce planning
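A ranking like the one behind the first insight could be sketched as follows. Ranking per year, the `date` column being a DATE type, and the names company_year, ranked, layoff_year, total_off and ranking are all assumptions made for illustration.

```sql
-- Sketch only: top 5 companies by total layoffs within each year.
WITH company_year AS (
    -- Total layoffs per company per year
    SELECT company,
           YEAR(`date`) AS layoff_year,
           SUM(total_laid_off) AS total_off
    FROM layoffs_staging
    GROUP BY company, YEAR(`date`)
),
ranked AS (
    -- DENSE_RANK gives tied companies the same rank and leaves no gaps
    SELECT company,
           layoff_year,
           total_off,
           DENSE_RANK() OVER (
               PARTITION BY layoff_year
               ORDER BY total_off DESC
           ) AS ranking
    FROM company_year
)
SELECT *
FROM ranked
WHERE ranking <= 5
ORDER BY layoff_year, ranking;
```

Because DENSE_RANK keeps ties, a year can return more than five rows when companies share a total.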
Impact
This analysis helps organizations and analysts better understand workforce trends and make proactive staffing decisions.
First, let's take a look at the data:

We need to check for duplicates first. Let's do this by numbering the rows that share the same values across the key columns; any row numbered above 1 is a duplicate.
-- Number the rows within each group of identical values
WITH duplicate_cte AS
(
    SELECT *,
        ROW_NUMBER() OVER(
            PARTITION BY company, location, industry, total_laid_off, percentage_laid_off,
                         `date`, stage, country, funds_raised_millions
        ) AS row_num
    FROM layoffs_staging
)
-- Keep only the repeated copies
SELECT *
FROM duplicate_cte
WHERE row_num > 1;

With this code we can see that we have multiple duplicates:

We need to remove these duplicates:
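One way to do this in MySQL is sketched below. MySQL does not allow deleting rows through a CTE, so this version copies the data into a second staging table with the row number stored as a real column, then deletes from that table. It assumes MySQL 8 or later, and layoffs_staging2 is a hypothetical table name chosen for illustration.

```sql
-- Sketch only: materialize row_num in a new table (layoffs_staging2 is a hypothetical name)
CREATE TABLE layoffs_staging2 AS
SELECT *,
       ROW_NUMBER() OVER (
           PARTITION BY company, location, industry, total_laid_off,
                        percentage_laid_off, `date`, stage, country,
                        funds_raised_millions
       ) AS row_num
FROM layoffs_staging;

-- Delete every copy after the first in each group
DELETE FROM layoffs_staging2
WHERE row_num > 1;

-- The helper column is no longer needed
ALTER TABLE layoffs_staging2
DROP COLUMN row_num;
```

If the session has sql_safe_updates enabled, the DELETE may be rejected because row_num is not a key column. Rerunning the duplicate check against layoffs_staging2 before dropping row_num should return no rows.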

Now we have removed all the duplicates in our data.


