DATA CLEANING AND PREPARATION PROJECT : Basic Data Cleaning Using Excel


Project Overview
In this project, I performed data cleaning and transformation on a large, messy sales dataset to ensure accuracy, consistency, and usability for analysis. The goal was to enhance data integrity and structure while optimizing it for further insights and visualization.
Steps Taken
1. Data Organization and Structuring
Renamed the sheets for better clarity.
Copied the raw, unclean data to a new sheet to preserve the original dataset.
Expanded columns to fit the data for better readability.
2. Data Enhancement
Added new columns to provide critical business insights, including:
Customer Name
Order Date
Ship Date
Estimated Time of Delivery
Region
Converted the dataset into a structured table for easier filtering and manipulation.
3. Data Cleaning Techniques Applied
Handling Missing Data:
Used
IF(A1="", A3, A1)
to fill blank customer names and region fields using the nearest available value.Date Formatting:
Corrected inconsistent date formats using
TEXT(A1, "YYYY-MM-DD")
to standardize them.Text-to-Columns Conversion:
Corrected numbers stored as text in quantity and price columns to ensure numerical operations function correctly.
Cleaning Numerical Data:
Used
ABS(SUBSTITUTE(SUBSTITUTE(A1, "two", "2"), "three", "3"))
to remove negative values in quantity and standardize incorrect representations of numbers (e.g., replacing "two" with 2 and "three" with 3).
4. Derived Metrics and Analysis
Estimated Time of Delivery Calculation:
Used
DATEDIF(Order Date, Ship Date, "D")
to determine the time taken for order fulfillment.Recalculated Total Price:
Ensured total price values were correct and consistent by recalculating them based on quantity and price.
Applied Conditional Formatting:
Used color scales to highlight significant trends in total price values, making anomalies and key insights more visible.
Freeze Panes for Better Viewing:
Used the freeze pane feature to keep the top row visible while scrolling, improving usability for large datasets.
Key Takeaways
Successfully cleaned and structured a large dataset, making it analysis-ready.
Ensured data consistency, accuracy, and improved readability through formatting and conditional formatting.
Applied advanced Excel functions to automate cleaning and enhance efficiency.
Strengthened my data analysis skills by transforming raw, unstructured data into a usable format for business insights.
This project showcases my expertise in Excel-based data cleaning, preparation, and transformation, essential for roles involving data analysis, business intelligence, and reporting.
Subscribe to my newsletter
Read articles from Mezu Nneoma directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Mezu Nneoma
Mezu Nneoma
I'm a healthcare professional exploring data analysis for public health. This blog documents my journey as I learn Excel, SQL, Power BI, and R, applying them to real-world healthcare data. I share insights, case studies, and tutorials on how data can drive better health outcomes.