Layoff Data Cleaning Project

pritydabhi02
Oct 11, 2025

Updated: Mar 25

In this project we walk through the process used to clean the raw layoffs data.

Layoff Data Cleaning & Exploratory Analysis

📌 Problem

Companies faced difficulty understanding layoff trends due to inconsistent and unclean data.

🛠 Tools Used: SQL (Window Functions, CTEs)

📊 Approach

Cleaned and standardized raw layoff data using SQL
Removed duplicates using ROW_NUMBER and handled missing values with self-joins
Performed exploratory analysis to identify trends across companies, industries, and time
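
As a rough sketch of the self-join step, here is how a missing value could be filled in from another row for the same company. It assumes the column being filled is industry and that rows are matched on company; neither detail is spelled out above, so treat both as placeholders and adjust them to your data.

```sql
-- Assumption: blank strings are treated as missing, so normalize them to NULL first.
UPDATE layoffs_staging
SET industry = NULL
WHERE industry = '';

-- Self-join: t1 is a row with a missing industry,
-- t2 is another row for the same company that has one.
UPDATE layoffs_staging AS t1
JOIN layoffs_staging AS t2
  ON t1.company = t2.company
SET t1.industry = t2.industry
WHERE t1.industry IS NULL
  AND t2.industry IS NOT NULL;
```

Rows whose company never appears with an industry stay NULL, so it is worth checking what is left afterwards.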

🔍 Key Insights

Identified top 5 companies with highest layoffs using DENSE_RANK
Discovered clear year-over-year spikes during economic downturns
Highlighted monthly layoff trends for workforce planning
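
A minimal sketch of the DENSE_RANK step might look like the query below. It assumes the ranking is based on each company's total layoffs across the whole dataset; the names company_totals, total_off, and ranking are made up for illustration.

```sql
-- Total layoffs per company (hypothetical CTE and column names).
WITH company_totals AS (
    SELECT company,
           SUM(total_laid_off) AS total_off
    FROM layoffs_staging
    GROUP BY company
),
ranked AS (
    -- DENSE_RANK gives tied companies the same rank without skipping the next one.
    SELECT company,
           total_off,
           DENSE_RANK() OVER (ORDER BY total_off DESC) AS ranking
    FROM company_totals
    WHERE total_off IS NOT NULL
)
SELECT company, total_off, ranking
FROM ranked
WHERE ranking <= 5
ORDER BY ranking;
```

Because ties share a rank, this can return more than five rows when two companies have the same total.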

📈 Impact

This analysis helps organizations and analysts better understand workforce trends and make proactive staffing decisions.

GitHub: https://github.com/prt-tv/Priti-s-Porfolio/blob/main/Data%20Cleaning.%20sql.sql

First, let's take a look at the data:

We need to check for duplicates first. Let's do this by numbering the rows within each group that shares the same values across the key columns; any row numbered above 1 is a repeat of an earlier row.

-- Number the rows inside each group of identical key-column values.
WITH duplicate_cte AS
(
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY company, location, industry, total_laid_off,
                            percentage_laid_off, `date`, stage, country,
                            funds_raised_millions
           ) AS row_num
    FROM layoffs_staging
)
-- Anything numbered 2 or higher is a duplicate.
SELECT *
FROM duplicate_cte
WHERE row_num > 1;

With this code we can see that we have multiple duplicates:

We need to remove these duplicates:
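
One way this could be done is sketched below. MySQL does not let us delete rows through a CTE, so the sketch copies the data into a second table together with the row number and deletes from that copy. The table name layoffs_staging2 is an assumption made for illustration, not necessarily the name used in the project.

```sql
-- Hypothetical working copy that stores the row number as a real column.
CREATE TABLE layoffs_staging2 AS
SELECT *,
       ROW_NUMBER() OVER (
           PARTITION BY company, location, industry, total_laid_off,
                        percentage_laid_off, `date`, stage, country,
                        funds_raised_millions
       ) AS row_num
FROM layoffs_staging;

-- Rows numbered 2 or higher repeat an earlier identical row.
DELETE FROM layoffs_staging2
WHERE row_num > 1;

-- The helper column is no longer needed once the repeats are gone.
ALTER TABLE layoffs_staging2
DROP COLUMN row_num;
```

Working on a copy also keeps layoffs_staging untouched in case the delete removes more than expected.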

Now we have removed all the duplicates in our data.
