Return Logistics Optimization Using Python & Power BI

43 min read

Understanding the need for Supply Chain Analysis
Exploring the Dataset
- 📊 Observations from Product Revenue
- Now Observing the Units sold product wise:
- Revenue Per Unit-How Much Each Product Earns
- Total Revenue by Shipping Carrier-Delivery Matters
- Return Rate by Product Type-What’s being sent back?
- Let’s check what were the reasons the customers gave for returning
- Points to Note:
- Product Reasons for Return
- 💸 Refund Status Breakdown – Who Gets What Back?
- 🧠 What I Took Away
- Correlation Matrix – What Features Move Together?
- Return Rate vs Lead Time
Analysing SKU’s
- Top 20 SKU by orders and stock
- 🚛 Total Shipping Cost by Carrier – What Are We Spending?
- 🧯 Supplier vs Product – Who’s Really Driving the Defects?
- 🕵️‍♂️ Inspection Results vs Defect Rates – Are We Really Catching the Issues?
- 💡 What I Take From This
- 📅 Monthly Trends – Sales, Defects & Returns Over Time
- 🔄 Relationship Insights
- 🔍 Key Takeaways
"While defect rates give us a glimpse into product quality, the real customer impact is better captured by looking at what gets returned — and when."
- Monthly Product Returns- Whats Going Back and When?
- 🧠 What I Take From This
- 🌍 Customer Segmentation – Who’s Driving Business (and Complaints)?
- 💰 Revenue by Segment and Region (Bar Plot)
- 📉 Decomposing Returns: Trend, Seasonality & What Lies Beneath
- 📈 Decomposing Sales: Trend, Seasonality & Residuals
- 🧠 What I Learned
Key Observations from Returns Decomposition Analysis
- 📊 Trend Analysis: Stability Over Time
- 🔄 Seasonality: Holiday Return Surge
- ⚡ Residuals: Unexpected Events
- 🎯 Business Impact & Action Items
- 🕒 Lead Time Component Analysis – What's Slowing Us Down?
- 🎛️ Simulation: Setting the Stage with Clean Monthly Data
🔮 When Algorithms Try to Predict the Future: A Tale of 5 Products
🎯 How I Picked the “Winner” Algorithm (a.k.a. My Forecasting Fight Club)
- ⚖️ The Weapon of Choice: MAE (Mean Absolute Error)
- 🧪 Let the Tournament Begin (Time Series Cross-Validation Style)
🏁 And the Results Are In...
🤔 So... Why These Models?
- 🧼 Moving Average: The Calm in the Storm (3 Wins)
- 🧃 Naive: The "Shrug Emoji" That Nailed Headphones
- 🔮 Prophet: The Fancy Forecasting Intern Who Nailed Laptops
🧠 Real Talk: What I Learned
- 1. Simple Models Win Fights
- 2. Volatile Data = Rage Quits for Algorithms
- 3. Prophet’s a Control Freak
💡 If You’re Doing This Too, Here’s the Dev Advice:
TL;DR – My Forecasting Fight Club Rules 🥋
📊 The Good, The Bad, and the “Wait, What?”
- 🎧 Headphones: The Rollercoaster That Broke Everyone's Brain
- 💻 Laptops: The Steady Eddie
- 📱 Smartphones: The Plot Twist Master
- ⌚ Smartwatches: The Underdog Story
- 📱 Tablets: The Seasonal Performer
🎯 What I Learned (Without the Buzzwords)
- ✅ 1. Simple Often Wins
- ⚠️ 2. Volatile Data = Forecasting Nightmare
- 📈 3. Trend Models Are Hopeless Optimists
- 😅 4. Prophet Has Trust Issues
🤖 What This Means for Devs Like Us
💡 Final Word
🧠 Simulation Phase: Predicting, Stress-Testing & Learning from Returns
- 🛠️ What Was Simulated and Why?
- 🔍 Step-by-Step: What Happens in the Simulation Engine?
- 🎲 What is Monte Carlo Simulation?
🔍 Forecasting Performance – What Worked & What Didn’t
- 🎧 Headphones – Spikey & Unpredictable
- 💻 Laptops – Surprisingly Predictable
- 📱 Smartphones – Model Overshoots, Market Shrugs
- ⌚ Smartwatches – Emerging but Solid
🔥 Return Cost Risk Zones – Where the Trouble Hits
- 🗓️ Q4 2024: The Danger Zone
- 📊 Category-Wise Risk Summary
Supply Chain Simulation Report
⚙️ Tech Implementation Insights
- ✅ Model Performance Was 🔥
🎲 Monte Carlo Distribution Patterns
- 📈 Distribution Shapes by Product:
- 📊 Volatility Snapshot: (95th Percentile vs Mean)
💼 Business Risk Insights
- 💸 Cost Exposure Breakdown
- 🚨 Operational Red Flags Across the Board
👨‍💻 Developer POV: What I Learned (And What Broke)
- 🔎 1. Data Quality Smells
- 🧱 2. Simulation Architecture Held Up Well
- ⚠️ 3. Technical Debt = Business Risk
Finally!! Monte Carlo Analysis
- 📱 Smartphones: The Drama Queen
- 💻 Laptops: The Steady Worrier
- 🎧 Headphones: The People Pleaser
- ⌚ Smartwatches: The Overachiever
- 📱 Tablets: The Wildcard
🚨 The “Oh Crap” Moments
- 1. SLA Glitch or Cosmic Joke?
- 2. Customers Are Actively Mad
- 3. We’re Awesome at Predicting Pain
💰 What I'd Do If This Were My Money
- 🔧 Quick Fixes
- ⏳ Medium-Term Fixes
- 🌐 Long-Term Strategy
🎲 Why Monte Carlo?
📌 Final Word
🧵 Wrapping It Up: Not a Nerd, Just Curious (And Sleep-Deprived) 😵‍💫
- 🫠 Real Talk: I'm Just a College Guy Doing My Best
- 📬 Slide into the Comments Like:
- Final Thought 🎯

Today’s fast-paced e-commerce innovations like 1-click delivery have revolutionized the way we shop. These online platforms have become a masterstroke in human convenience, solving multiple logistical challenges and reshaping the traditional shopping mindset. But as with every breakthrough, this digital revolution brings its own set of challenges.
One such challenge is the rise of product returns and exchanges.
Once considered a minor afterthought has now become a core pillar in modern supply chain operations. Especially in the consumer electronics sector, returns are not just frequent but are expected.
From manufacturing defects to wrong deliveries and rapidly changing preferences, the reasons are diverse, but the impact is undeniable.

Understanding the need for Supply Chain Analysis

Since returns are no longer just a customer service issue, they now directly affect warehousing, inventory planning, future trends, and customer perception. Frequent returns can disrupt stock levels, increase lead times, and raise operational costs, reducing overall efficiency.
This is where supply chain analytics becomes essential.
By analyzing return data and other performance metrics, we can identify bottlenecks and causes of losses or returns, improving customer satisfaction throughout the supply chain.

Exploring the Dataset

To perform the supply chain analysis, I began with a real-world styled dataset that records the entire lifecycle of an order from manufacturing to return.
The dataset includes features like:

Order details: SKU, Order_Date, Product_Type, Price, Number_Products_Sold, Revenue_Generated
Customer & region data: Customer_Region, Customer_Segment
Supplier-side metrics: Supplier_Name, Supplier_Defect_Rate, Inspection_Result, Manufacturing_Lead_Time
Return information: Defect_Rate, Return_Reason, Return_Date, Is_Return
Logistics: Shipping_Carrier, Shipping_Cost, Transport_Mode, Lead_Time, Shipping_Time
Outcome data: Refund_Status, Notes, Order_Month, Availability

With such a comprehensive dataset, I conducted:
Exploratory Data Analysis(EDA) to understand patterns in returns, defects,delays and costs
Forecasting Scripts(using models like ARIMA/Prophet) to predict future return volumes

Simulation scripts to run “what if” scenarios (eg, lead time delays, defect rate spikes, or policy changes)

Analysis Using Python

Before jumping into forecasting or simulations, it was essential to thoroughly explore and understand the dataset. The main objective of this phase was to uncover hidden patterns, outliers, and correlations that could explain:

Why returns happen
What factors influence them the most

How do different operational aspects (like shipping, inspection, or supplier defects) contribute to overall costs.

Let's get started by importing the necessary Python libraries and the dataset.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.ticker as mtick
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go

import sys
import os
sys.path.append('../scripts')

import data_loader # type: ignore
from forecasting_models import run_forecasting_engine_powerbi #type: ignore
from simulation_engine import run_simulation_engine #type: ignore
sns.set(style='whitegrid')
print("✅ All modules imported successfully!")

df = pd.read_csv('../data/data.csv')  
df.head()

Lets have a look at desprictive statistics of the dataset

df.info()
df.isnull().sum()


df['Return_Date'] = pd.to_datetime(df['Return_Date'])


df['Lead_Time'].fillna(df['Lead_Time'].median(), inplace=True)


df['Return_Week'] = df['Return_Date'].dt.isocalendar().week

df.info()- gives an overview of the DataFrame.
used to understand which columns might have missing values or incorrect data types before preprocessing.

df.isnull() returns a DataFrame of True/False for each cell(True if its null)
.sum() adds up the true values column-wise.

df['Return_Date'] = pd.to_datetime(df['Return_Date'])
converts the ‘Return_Data column from string format to proper datetime64 format using pandas.
To enable date operations, like extracting the week, month, or calculating durations.

.fillna(…,inplace=True) fills missing values in that column with the median directly in the DataFrame.

df['Return_Week'] = df['Return_Date'].dt.isocalendar().week extracts the week number of the return date.

Total Entries: 15,000 records
Inspection_Result→ ~12% missing (only 13159 non null values)
indicating quality check issues.
Return_Reason, Return_Date, Refund_Status → Present only for 2,425 entries
Supplier_Defect_Rate, Inspection_Result, Defect_Rate → Quality tracking
Shipping_Cost, Transport_Mode, Shipping_Time, Lead_Time → Logistics data
Customer_Region, Segment, SKU, Product_Type → Market segmentation

revenue_by_product = df.groupby('Product_Type')['Revenue_Generated'].sum()
units_sold = df.groupby('Product_Type')['Number_Products_Sold'].sum()
ship=df.groupby('Shipping_Carrier')['Revenue_Generated'].sum().reset_index()
revenue_per_unit = revenue_by_product / units_sold
avg_revenue_per_unit = df.groupby('Product_Type').apply(
    lambda x: x['Revenue_Generated'].sum() / x['Number_Products_Sold'].sum()
)


summary = pd.DataFrame({
    'Total_Revenue': revenue_by_product,
    'Total_Units_Sold': units_sold,
    'Avg_Revenue_Per_Unit': avg_revenue_per_unit
}).reset_index() 

fig = px.bar(summary, 
             x='Product_Type', 
             y='Total_Revenue', 
             title='Total Revenue by Product Type',
             text='Total_Revenue')

fig.show()

🏷️ Grouped by Product_Type to analyze performance across categories.
💰 Total Revenue: Calculated how much revenue each product type generated overall.
📦 Total Units Sold: Checked how many units were sold per product type.
📈 Avg Revenue per Unit: Computed by dividing total revenue by units sold — shows earning power per product.
📊 Visualization: Created a bar chart to highlight top revenue-generating products at a glance.
🎯 Insight: Helps identify high-volume vs high-margin products, guiding both pricing and inventory strategy.

📊 Observations from Product Revenue
- 📱 Smartphones and Laptops dominate the revenue chart — both contributing nearly ₹130M+ each over the 2 years.
- 🎧 Headphones, Smartwatches, and Tablets trail far behind, each generating around ₹10M, a stark contrast in revenue contribution.
- 🔍 This clearly shows that the company’s revenue is heavily dependent on high-ticket products like smartphones and laptops.
- ⚖️ The disparity also raises questions — are the lower-selling products underperforming, or simply not promoted as aggressively?
- 📈 This insight can guide product focus, marketing budgets, and even inventory decisions going forward.

Now Observing the Units sold product wise:

pie_chart=px.pie(units_sold,
                 values='Number_Products_Sold',
                 names=units_sold.index,
                 title='Units Sold by Product Type',
                 hole=0.5,
                 color_discrete_sequence=px.colors.qualitative.Pastel)
pie_chart.update_traces(textposition='inside', textinfo='percent+label')
pie_chart.show()

As expected, smartphones not only generated the highest revenue but also had the highest number of units sold.

Laptops followed the same trend, securing the second spot in both revenue and sales volume.
Smartwatches and tablets, which showed lower revenue earlier, also contributed a smaller share of total units sold.
The alignment between units sold and revenue highlights the company’s reliance on high-demand product categories.
This consistency suggests a healthy correlation between product popularity and profitability.

Revenue Per Unit-How Much Each Product Earns

💻 Laptops lead the pack with the highest revenue per unit — over ₹1300 on average. Not surprising, since they’re premium products with high price tags.
📱 Smartphones come next, generating around ₹824 per unit — a strong performer given their mass-market appeal and consistent demand.
📊 Tablets and Smartwatches fall into the mid-range, earning ₹574 and ₹390 per unit respectively — pointing toward moderate pricing and margins.
🎧 Headphones earn the least per unit (₹186), suggesting they’re low-margin, high-volume items — possibly used more to complement sales or offer combos.

Total Revenue by Shipping Carrier-Delivery Matters

fig=go.Figure()
fig.add_trace(go.Bar(x=ship['Shipping_Carrier'],
                     y=ship['Revenue_Generated']))
fig.update_layout(title='Total Revenue by Shipping Carrier',
                  xaxis_title='Shipping Carrier',
                  yaxis_title='Revenue Generated')
fig.show()

📦 I analyzed how much revenue was associated with each shipping carrier.
🚀 Carrier_X stood out as the top performer, contributing the highest revenue over the 2-year period.
📈 Carrier_Y followed closely, showing it’s also a reliable contributor to overall logistics performance.
✅ Carrier_Z generated a decent amount of revenue — not leading, but still contributing steadily.
⚠️ Carrier_W, however, lags significantly behind the others, indicating room for improvement in either market coverage or operational efficiency.
🔍 These patterns could point to differences in carrier reliability, speed, or customer satisfaction — worth exploring further.

Return Rate by Product Type-What’s being sent back?

sns.set_style("whitegrid")
sns.set_context("talk", font_scale=1.1)

return_counts = df['Product_Type'].value_counts()
total_sales = df.groupby('Product_Type')['Number_Products_Sold'].sum()
return_rate = (return_counts / total_sales).sort_values(ascending=False)


plt.figure(figsize=(14,7))
palette = sns.color_palette("coolwarm", len(return_rate))

bars = sns.barplot(x=return_rate.index, y=return_rate.values, palette=palette)


for bar in bars.patches:
    height = bar.get_height()
    bars.annotate(f'{height:.1%}', 
                  (bar.get_x() + bar.get_width() / 2, height), 
                  ha='center', va='bottom', fontsize=11, fontweight='bold', color='#444')

plt.title('Return Rate by Product Type', fontsize=20, fontweight='bold', pad=15)
plt.ylabel('Return Rate', fontsize=14)
plt.xlabel('Product Type', fontsize=14)
plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter(1.0))
plt.xticks(rotation=40, ha='right')
plt.ylim(0, max(return_rate.values)*1.15)
plt.tight_layout()
plt.show()

⌚ Smartwatches top the return rate chart at 7.8%. This could be due to compatibility issues, overhyped expectations, or customers realizing they don’t need it after purchase.
💊 Tablets follow with a 5.8% return rate — likely reasons might include performance dissatisfaction, screen issues, or overlap with smartphones.
🎧 Headphones show a 5.2% return rate. That’s possibly due to sound quality concerns, comfort issues, or defects in delivery (like one side not working).
💻 Laptops, despite being high-value products, have a moderate 3.9% return rate — which could stem from technical glitches, wrong specifications, or slower-than-expected performance.
📱 Smartphones surprisingly have the lowest return rate at 3.1%, possibly because they’re more familiar, consistent, and customers know exactly what they’re getting.

Let’s check what were the reasons the customers gave for returning

returned_units = df[df['Return_Reason'].notnull()].groupby(['Product_Type', 'Return_Reason'])['Number_Products_Sold'].sum()
return_rate_reason= (returned_units/total_sales).sort_values(ascending=False)
print(return_rate_reason)

Digging deeper into returns, I looked at not just how often products are returned, but why. Here’s what stood out:

Defective products were the top reason for returns across most categories — especially for:
- Tablets (5.1% of sold units)
- Smartwatches (5.0%)
- Laptops and Smartphones also had defect-related returns close to 4.5%

These numbers suggest possible issues in quality control or supplier defects, especially in smart devices.

Late Delivery triggered a high return rate for:
- Headphones (4.8%) — the highest in this category
- Laptops, Smartphones, and Tablets also saw significant late-delivery-related returns

This points to logistics issues, possibly from specific carriers or warehouse bottlenecks.

Changed Mind was another recurring theme, particularly for:
- Tablets (4.0%)
- Laptops, Smartwatches, and Smartphones

This could be due to impulse buying, unclear product specs, or better options available post-purchase.

Wrong Product Delivered and ‘Other’ reasons were less frequent, but still present across all categories.

Points to Note:

Defects and delivery delays are the two biggest return drivers-which are both operationally solvable

Categories like Smartwatches and Tablets are specially sensitive to quality or experience gaps.

Product Reasons for Return

grouped_data = returns.groupby(['Product_Type', 'Return_Reason']).size().reset_index(name='Return_Count')

fig = px.bar(
    grouped_data,
    x='Product_Type',
    y='Return_Count',
    color='Return_Reason',
    barmode='group',
    title='Return Reasons by Product Type (Grouped)'
)
fig.show()

🛠️ The most common reason for product returns is defective items being delivered — across almost all product types.
🎧 Headphones show high return rates not just due to defects, but also because of late deliveries, which likely frustrate customers expecting quick fulfillment.
💻 Laptops also see significant returns due to delayed shipping, which can impact time-sensitive buyers like students or professionals.
🧯 The dominance of defect-related returns highlights the need for stronger quality control and supplier accountability.
🚛 On the other hand, the impact of delivery delays shows how much logistics and customer experience influence return behavior — not just product quality.

💸 Refund Status Breakdown – Who Gets What Back?

To understand the financial impact of product returns, I explored how returned items were handled under different refund statuses.

Using a grouped bar chart, I broke down the number of returns by Refund_Status, along with the return rate percentage attached to each:

refund_counts=df[df['Return_Date'].notnull()]['Refund_Status'].value_counts().reset_index()
refund_counts.columns = ['Refund_Status', 'Count']
refund_counts['Return_Rate (%)'] = (refund_counts['Count'] / refund_counts['Count'].sum()) * 100
fig=px.bar(refund_counts,
           x='Refund_Status',
           y='Count',
           text='Return_Rate (%)',
           color='Refund_Status',
           labels={'Count': 'Number of Refunds'})

fig.update_traces(textposition='outside', texttemplate='%{text:.2f}%')
fig.update_layout(yaxis_title='Returns', xaxis_title='Refund Status', showlegend=False)
fig.show()

Refund Status	💰 Financial Impact	🔍 Possible Insight
Fully Refunded	✅ High cost	Reflects a customer-centric policy — full refunds are costly but may preserve brand trust and long-term loyalty.
Pending	⏳ Operational lag	Indicates ongoing processing or delays — could be due to backlog, manual inspection, or unclear policies. High pending volume might frustrate customers.
Denied	💸 Low refund cost	Suggests strict return validation — minimizes losses, but if not transparent, can hurt customer satisfaction and lead to disputes or negative sentiment.

🧠 What I Took Away
- A high proportion of full refunds signals a customer-first approach, which is great for brand trust — but it also comes with a significant operational cost if defects or avoidable issues are the root cause.
- A growing “Pending” segment feels like a red flag — possibly pointing to processing delays, manual verification bottlenecks, or unclear SOPs. If left unchecked, this can hurt the post-purchase experience.
- Denied refunds, while minimizing cost, must be justified and transparent. If customers feel unfairly denied, it risks eroding trust and triggering complaints, especially if denial reasons aren’t communicated clearly.

Correlation Matrix – What Features Move Together?

numeric_cols = [
    'Price',
    'Number_Products_Sold',
    'Revenue_Generated',
    'Lead_Time',
    'Manufacturing_Lead_Time',
    'Shipping_Time',
    'Shipping_Cost',
    'Supplier_Defect_Rate',
    'Defect_Rate'
]


corr = df[numeric_cols].corr()

plt.figure(figsize=(12,10))
sns.heatmap(corr, annot=True, cmap='coolwarm', fmt='.2f', square=True)
plt.title('Correlation Matrix of Key Numeric Features', fontsize=16, fontweight='bold')
plt.show()

💰 Price and Revenue Generated (0.74)

Strong positive correlation here- as expected,higher-proced products generate more revenue, assuming the demand remains steady.
📦 Price and Shipping Cost (0.87)

A very strong correlation suggests that costlier products tend to be more expensive to ship due to size, weight or the need for special handling.
🧯 Supplier Defect Rate vs Defect Rate (very high)
A strong positive correlation shows that supplier quality directly impacts the final defect rate. This reinforces how critical supplier reliability is in the overall product experience.
🕒 Lead Time and Defect Rate (-0.33)
A moderate negative correlation here is interesting — it might mean that longer lead times (possibly due to careful handling or more thorough inspection) result in fewer defects.
For example, if I want to reduce defects, the answer might not lie in faster shipping but in choosing better suppliers or allowing slightly longer lead times.
Similarly, logistics planning for high-priced items must account for proportionally higher shipping costs, which could impact profit margins.

Return Rate vs Lead Time

How the lead time impacts return behaviour.

plt.figure(figsize=(12,6))
sns.set_style("whitegrid")


grouped = df.groupby(['Lead_Time', 'Product_Type'])['Defect_Rate'].mean().reset_index()

sns.lineplot(data=grouped, x='Lead_Time', y='Defect_Rate', hue='Product_Type', marker='o')

plt.title("Average Return Rate vs Lead Time by Product Type", fontsize=16, weight='bold')
plt.xlabel("Lead Time (Days)")
plt.ylabel("Avg. Return Rate (%)")
plt.legend(title='Product Type')
plt.tight_layout()
plt.show()

🕐 General Trend – Returns Drop as Lead Time Increases (Initially)

Across most product types, return rates tend to decrease as lead time increases, especially up to around 30–35 days.
Short lead times (under 15 days) are often linked to higher return rates — possibly due to:
- Rushed quality checks
- Inaccurate order handling
- Customers quickly reconsidering purchases

⚠️ Volatile Return Rates at Short Lead Times (5–15 Days)

Returns are most unpredictable and elevated when lead times are short.
🎧 Headphones spiked heavily around a 9-day lead time, with return rates exceeding 8% — a clear outlier.
Other product types also show instability in this range, though less extreme.

✅ Stabilization at Mid-Range Lead Times (25–45 Days)

Between ~25 and 45 days, return rates stabilize and drop to around 2–3% across the board.
This might reflect an “optimal window” where:
- Products are properly processed
- Delivery timelines meet expectations
- Customer satisfaction is higher

📉 Volatility Returns at Longer Lead Times (>45 Days)

Beyond 45 days, return rates become erratic again.
⌚ Smartwatches show a sharp spike (5.5%) around 57–58 days.
📱 Smartphones and Tablets also show fluctuations — possibly due to:
- Customer frustration or losing interest
- Delayed needs or seasonal misalignment
- Smaller sample size in this segment

🔍 Product-Specific Observations

🎧 Headphones: Highest return rate at shortest lead time (possible quality/logistics issue).
💻 Laptops: Relatively stable and low return rates regardless of lead time — suggests strong consistency and customer satisfaction.
⌚ Smartwatches: Large spike at long lead time → may indicate issues with handling or customer expectations.

Analysing SKU’s

SKU stands for Stock Keeping Units. They’re like special codes that help companies keep track of all the different things they have for sale. Imagine you have a large toy store with lots of toys. Each toy is different and has its name and price, but when you want to know how many you have left, you need a way to identify them. So you give each toy a unique code, like a secret number only the store knows. This secret number is called SKU.

sku_revenue = df.groupby('SKU')['Revenue_Generated'].sum().sort_values(ascending=False).head(20)

fig = px.bar(sku_revenue.reset_index(),
             x='SKU', y='Revenue_Generated',
             title='Top 20 SKUs by Revenue')
fig.show()

Top 20 SKU by orders and stock

total_sku= sku_summary.nlargest(20,'Total_Sold')
melted = total_sku.melt(id_vars='SKU', 
                         value_vars=['Total_Stock', 'Total_Sold'], 
                         var_name='Metric', 
                         value_name='Value')




fig = px.bar(melted,
             x='SKU', 
             y='Value', 
             color='Metric', 
             barmode='group',
             title='Top 20 SKUs: Stock vs Units Sold')

fig.show()

🚛 Total Shipping Cost by Carrier – What Are We Spending?

shipping_cost_by_carrier = df.groupby('Shipping_Carrier')['Shipping_Cost'].sum().reset_index()
shippping_cost_chart=px.bar(shipping_cost_by_carrier,x='Shipping_Carrier',
                     y='Shipping_Cost',
                     title='Total Shipping Cost by Carrier',
                     color='Shipping_Carrier')
shippping_cost_chart.show()

📦 Carrier X recorded the highest shipping cost at around ₹223.7K — not surprising, since it also handled the most high-revenue deliveries.
💰 This suggests that while Carrier X is expensive, it’s also driving the bulk of our business — possibly used for premium or high-volume products.
📉 Other carriers show lower shipping costs, which could be due to:
- Handling fewer orders
- Specializing in specific regions or product types
- Being assigned to lower-value shipments
🧠 It’s a good reminder that high cost isn’t always bad — in this case, it may reflect reliability, faster service, or strategic priority routes.🧯 Supplier vs Product – Who’s Really Driving the Defects?

🧯 Supplier vs Product – Who’s Really Driving the Defects?

To pinpoint the source of high defect rates, I broke it down by both supplier and product type using a heatmap. Here's what I uncovered:


heat_df = df.groupby(['Supplier_Name', 'Product_Type'])['Defect_Rate'].mean().unstack()

plt.figure(figsize=(12,6))
sns.heatmap(heat_df, annot=True, cmap="Reds", fmt=".2%")
plt.title("Average Defect Rate by Supplier and Product Type")
plt.xlabel("Product Type")
plt.ylabel("Supplier")
plt.xticks(rotation=45)
plt.show()

📊 Major Variation Across Suppliers

The average defect rates vary drastically depending on the supplier.
🟢 Supplier_A consistently has the lowest defect rates, staying in the 140–150% range — suggesting excellent quality control.
🟡 Supplier_B and Supplier_D fall into a moderate band (~190–300%).
🔴 Supplier_C and Supplier_E have alarmingly high defect rates, hovering around 400–500%, making them the biggest contributors to quality issues.

🏭 Supplier Matters More Than Product Type

For any given supplier, the defect rate remains nearly the same across all products.
This suggests that supplier-level quality processes (not product complexity) are the main drivers of defects.
Example: Supplier_C performs equally poorly across every product — from headphones to tablets — all hovering close to ~500%.

🥇 Best vs Worst Performers

✅ Best Performer: Supplier_A stands out with consistently low defect rates — clearly the most reliable vendor.
❌ Worst Performers: Supplier_C and Supplier_E are responsible for the highest defect rates, several times worse than Supplier_A.

🧠 Key Takeaway

Switching or auditing poor-performing suppliers could drastically improve product quality across the board.
Investing in Supplier_A-like standards or moving more volume to them might be a smart strategy.

🕵️‍♂️ Inspection Results vs Defect Rates – Are We Really Catching the Issues?

To assess how effective inspections are in filtering out defective products, I compared defect rates based on inspection results ("Pass" vs "Fail") across all suppliers. Here’s what I discovered:


heat_df = df.groupby(['Supplier_Name', 'Inspection_Result'])['Defect_Rate'].mean().unstack()

plt.figure(figsize=(12,6))
sns.heatmap(heat_df, annot=True, cmap="Reds", fmt=".2%")
plt.title("Average Defect Rate by Supplier and Inspection Result")
plt.xlabel("Inspection Result")
plt.ylabel("Supplier")
plt.xticks(rotation=45)
plt.show()

❌ “Fail” Inspection Results Show Higher Defects — As Expected

For every supplier, items that fail inspection have much higher average defect rates.
📉 Example: Supplier_A shows a 353.56% defect rate on failed items vs 124.66% on passed ones — a consistent trend across the board.
This confirms that inspections do help flag high-defect batches, but...

😬 Even “Pass” Results Have Alarmingly High Defect Rates

No supplier has a defect rate under 100%, even for products that passed inspection.
🔴 Supplier_C is the worst offender, with 476.43% defect rate on passed items — meaning, on average, more than 4 defects per item!
🟢 Even Supplier_A, the best performer, still logs 124.66% defects for "Pass" — suggesting a serious gap in inspection standards.

🏭 Supplier Differences Still Dominate

Just like the earlier supplier-product analysis, supplier quality matters more than the inspection outcome.
The best-performing supplier (Supplier_A) shows lower defects in both Pass and Fail categories, while Supplier_C remains the worst.
Even when products “pass,” the supplier's base quality level seems to dominate the defect count.

🚨 Key Red Flags

✅ "Pass" clearly doesn’t mean defect-free — in fact, defect rates >100% hint that multiple defects per item are common.
🤖 This could mean:
- Inspections are superficial or incomplete
- The pass criteria are too loose
- Or the defect rate metric is based on multiple types of issues per item

💡 What I Take From This

Inspection processes need revamping — especially for suppliers with already high base defect levels.
More importantly, we need to ask: is the inspection process actually catching the root problems, or just checking boxes?
Trusting "Pass" blindly could be costing the company post-shipment returns, dissatisfaction, and repair costs.

📅 Monthly Trends – Sales, Defects & Returns Over Time

To understand how our operations evolve over time and whether seasonality or scaling affects quality, I analyzed monthly trends for:

🛒 Units Sold
🧯 Defect Rates
🔁 Return Volume

df['Order_Month']=pd.to_datetime(df['Order_Date']).dt.to_period('M')
df['Return_Month']=pd.to_datetime(df['Return_Date']).dt.to_period('M')


monthly_sales = df.groupby('Order_Month').agg({
    'Number_Products_Sold': 'sum',
    'Defect_Rate': 'mean',
}).reset_index()
monthly_returns = df[df['Return_Date'].notnull()].groupby('Return_Month').size().reset_index(name='Returns')

monthly_sales['Order_Month'] = monthly_sales['Order_Month'].dt.to_timestamp()
monthly_returns['Return_Month'] = monthly_returns['Return_Month'].dt.to_timestamp()


fig_main = go.Figure()


fig_main.add_trace(go.Scatter(
    x=monthly_sales['Order_Month'],
    y=monthly_sales['Number_Products_Sold'],
    mode='lines+markers',
    name='Sales'
))


fig_main.add_trace(go.Scatter(
    x=monthly_sales['Order_Month'],
    y=monthly_sales['Defect_Rate'] * 100,
    mode='lines+markers',
    name='Defect Rate (%)',
    yaxis='y2'
))

fig_main.update_layout(
    title='📊 Monthly Sales & Defect Rate',
    xaxis_title='Month',
    yaxis=dict(title='Units Sold', rangemode='tozero'),
    yaxis2=dict(
        title='Defect Rate (%)',
        overlaying='y',
        side='right',
        showgrid=False,
        rangemode='tozero'
    ),
    legend=dict(x=0, y=0)
)

fig_main.show()


fig_returns = go.Figure()

fig_returns.add_trace(go.Scatter(
    x=monthly_returns['Return_Month'],
    y=monthly_returns['Returns'],
    mode='lines+markers',
    name='Returns',
    line=dict(color='crimson')   
))

fig_returns.update_layout(
    title='📉 Monthly Returns',
    xaxis_title='Month',
    yaxis_title='Units Returned',
    yaxis=dict(rangemode='tozero'),
    showlegend=False
)

fig_returns.show()

📈 Sales Trends

🟢 Strong initial growth: Sales were low in May 2024 (~2.5K units) but jumped dramatically to ~34K units by June — a rapid scale-up.
📉 Gradual decline: Post-June, sales slowly dipped over the next several months, hitting a low of ~21K in Feb 2025.

🔁 Mild recovery: There was a rebound in March (~30K), followed by another soft dip to ~25K by May 2025.

🧪 Defect Rate Behavior

🚨 Extremely high starting point: Defect rate was ~270% in May, showing early operational quality issues.
✅ Quick improvement: Dropped to ~260% by June, right when sales spiked.

📊 Stable plateau: From June onward, the defect rate remained consistently high (~250–270%), showing no strong improvement or deterioration regardless of sales volume.

🔁 Monthly Returns

📦 Returns mirror sales lag: May returns were negligible (matching low sales), but surged in June–July as sales took off.
⚖️ Stable fluctuations: From July to March, returns hovered around 180–220 units/month, with notable peaks in:
- 🗓️ November 2024 (~215 units)
- 🗓️ March 2025 (~210 units)
📉 Sharp drop in May 2025: Returns dropped dramatically to ~70 units, potentially due to declining sales or improved delivery/inspection processes.

🔄 Relationship Insights

🧠 Sales vs Defects

The initial sales ramp-up coincided with a drop in defect rates, suggesting that scaling operations actually improved early quality control.
However, post-June, there’s no strong correlation between sales volume and defects — defects remained high and steady despite sales fluctuation.
Takeaway: Quality issues may be rooted deeper — likely supplier-related (as we’ve seen), not tied to operational load.

📦 Returns vs Sales

Returns are clearly sales-driven, with a lag effect (returns rise after sales spikes).
But the return rate (returns/sales) may not have improved significantly — it just scaled with volume until the recent drop in May.

🔍 Key Takeaways

📆 Monthly tracking is essential for spotting capacity or quality problems before they scale.
❗ Our defect rate is consistently high, even with varying sales — and needs to be tackled at the supplier level, not just post-production.
🧹 The recent drop in returns may be worth investigating — is it due to improvement, or just fewer orders?

"While defect rates give us a glimpse into product quality, the real customer impact is better captured by looking at what gets returned — and when."

To understand how customer dissatisfaction evolves over time, I looked at monthly return volumes:

monthly_returns = df[df['Return_Date'].notnull()].groupby('Return_Month').size().reset_index(name='Returns')
monthly_returns['Return_Month'] = monthly_returns['Return_Month'].dt.to_timestamp()
monthly_returns

| |
Return_Month | Returns | | --- | --- | --- | | 0 | 2024-05-01 | 1 | | 1 | 2024-06-01 | 118 | | 2 | 2024-07-01 | 210 | | 3 | 2024-08-01 | 215 | | 4 | 2024-09-01 | 211 | | 5 | 2024-10-01 | 205 | | 6 | 2024-11-01 | 185 | | 7 | 2024-12-01 | 216 | | 8 | 2025-01-01 | 195 | | 9 | 2025-02-01 | 199 | | 10 | 2025-03-01 | 212 | | 11 | 2025-04-01 | 196 | | 12 | 2025-05-01 | 194 | | 13 | 2025-06-01 | 68 |

Monthly Product Returns- Whats Going Back and When?

🔼 Initial Return Surge (June–August 2024)

📦 In May 2024, returns were almost nonexistent (1 unit) — expected due to low initial sales.
From June to August, returns skyrocketed:
- June: 118
- July: 210
- August: 215
This surge directly followed the massive increase in sales in June — indicating a natural lag between purchases and returns.

📊 Stabilization Period (Sep 2024 – Apr 2025)

Returns stayed fairly stable, ranging from 185 to 216 units/month.
Despite fluctuations in sales volume, the number of returns held within a tight and predictable band, suggesting:
- A steady return rate
- Or that issues like defects and late delivery remained constant

📉 Notable Drop (May–June 2025)

Returns dipped slightly in May 2025 (194 units) — but then collapsed in June to just 68 units.
This sudden drop could be due to:
- 📉 Lower sales volume
- 🛠️ Quality or delivery improvements
- 📦 Or simply data cutoff (if the month is incomplete)

🧠 What I Take From This

The return pattern lags behind sales, peaking a month or two after spikes in orders.
Returns are consistent and high, pointing to chronic quality or delivery issues, not one-off events.
The June 2025 drop is worth investigating — if it’s a sign of real improvement, that’s a big win. If not, we may just be seeing lower order volumes.

🌍 Customer Segmentation – Who’s Driving Business (and Complaints)?

After exploring how products perform month-by-month, I wanted to dig deeper into who’s buying, where they’re buying from, and how happy they really are. So I grouped the data by Customer Segment and Region to analyze:

💰 Total revenue generated
🧯 Average defect rates

grouped=df.groupby(['Customer_Segment', 'Customer_Region']).agg({
    'Revenue_Generated': 'sum',
    'Defect_Rate': 'mean',
    'Return_Reason': 'count'
}).reset_index()

fig1=px.bar(grouped,x='Customer_Segment',
            y='Revenue_Generated',
            color='Customer_Region',
            barmode='group',
            title='Revenue by Customer Segment and Region')
fig1.show()

💰 Revenue by Segment and Region (Bar Plot)

Regular segments dominated in every region.
Most of the Revenue was generated by North region.

fig3 = px.bar(grouped, 
              x='Customer_Segment', y='Defect_Rate', 
              color='Customer_Region', barmode='group',
              title='⚠️ Avg Defect Rate by Customer Segment and Region')
fig3.show()

Here it can be observed that the defect rate is nearly same of every customer segment and customer region.

📉 Decomposing Returns: Trend, Seasonality & What Lies Beneath

To prepare for forecasting and to better understand the dynamics behind customer returns, I applied time series decomposition to our monthly returns data. This technique breaks the time series into three components:

Trend (long-term direction)
Seasonality (cyclical repeating patterns)
Residuals (random noise or unexplained variations)

from statsmodels.tsa.seasonal import seasonal_decompose
import plotly.graph_objects as go

# Convert date columns to datetime
df['Order_Date'] = pd.to_datetime(df['Order_Date'])
df['Return_Date'] = pd.to_datetime(df['Return_Date'])

# Create monthly aggregations
sales_monthly = df.groupby(pd.Grouper(key='Order_Date', freq='M'))['Number_Products_Sold'].sum()
returns_monthly = df[df['Return_Date'].notnull()]\
                    .groupby(pd.Grouper(key='Return_Date', freq='M')).size()

# Perform decompositions
decomposed_sales = seasonal_decompose(sales_monthly, model='additive', period=6)
decomposed_returns = seasonal_decompose(returns_monthly, model='additive', period=6)

# Plot 1: SALES Decomposition (corrected)
fig_sales = go.Figure()

fig_sales.add_trace(go.Scatter(x=sales_monthly.index, y=decomposed_sales.trend,
                         mode='lines', name='Trend', line=dict(color='blue')))
fig_sales.add_trace(go.Scatter(x=sales_monthly.index, y=decomposed_sales.seasonal,
                         mode='lines', name='Seasonality', line=dict(color='orange')))
fig_sales.add_trace(go.Scatter(x=sales_monthly.index, y=decomposed_sales.resid,
                         mode='lines', name='Residuals', line=dict(color='red')))

fig_sales.update_layout(
    title='📈 Sales Decomposition: Trend, Seasonality & Residuals',
    xaxis_title='Month',
    yaxis_title='Sales Components',
    legend=dict(x=0.01, y=0.99)
)

fig_sales.show()

📈 Decomposing Sales: Trend, Seasonality & Residuals

To better understand how our sales evolve over time — and to prep for accurate forecasting — I performed a time series decomposition on our monthly sales data. This broke it down into:

Trend – the long-term direction of sales
Seasonality – regular patterns repeating each cycle
Residuals – random or unexplained variations

📊 Trend – A Steady Mid-Range Flow

The trend line shows a relatively stable plateau from September 2024 to March 2025, following an earlier spike.
After the initial ramp-up in June 2024, the system seems to have entered a steady operational phase, with no extreme upward or downward drift in monthly sales.

🔁 Seasonality – Sales Cycles Are Real

I noticed distinct seasonal peaks in July and December–January — likely driven by festive promotions, holiday shopping, and new launches.
In contrast, August–September and February–March showed seasonal dips, which could point to slower demand or post-holiday cooldowns.
These patterns are critical for planning promotions, inventory, and staffing ahead of time.

🔍 Residuals – The Unexplained Wiggles

For most of the timeline, residuals are close to zero, meaning the model captures sales behavior quite well.
However, in March 2025, there’s a notable negative residual — actual sales were lower than expected, even after adjusting for trend and seasonality.
This might reflect a drop in consumer sentiment, delayed marketing, or external disruptions like holidays or logistical slowdowns.

🧠 What I Learned

This decomposition clarified that our sales system, while initially volatile, has now settled into a predictable rhythm — with strong seasonal surges and manageable fluctuations.
These insights will help strengthen future forecasting models like Prophet or ARIMA.
Most importantly, it gives us the confidence to align supply chain capacity and marketing efforts with real seasonal cycles — not just gut instinct.

fig = go.Figure()

fig.add_trace(go.Scatter(x=returns_monthly.index, y=decomposed_returns.trend,
                         mode='lines', name='Trend', line=dict(color='blue')))
fig.add_trace(go.Scatter(x=returns_monthly.index, y=decomposed_returns.seasonal,
                         mode='lines', name='Seasonality', line=dict(color='orange')))
fig.add_trace(go.Scatter(x=returns_monthly.index, y=decomposed_returns.resid,
                         mode='lines', name='Residuals', line=dict(color='red')))

fig.update_layout(

    title='📉 Returns Decomposition: Trend, Seasonality & Residuals',
    xaxis_title='Month',
    yaxis_title='Returns Components',
    legend=dict(x=0.01, y=0.99)
)

fig.show()

Key Observations from Returns Decomposition Analysis

📊 Trend Analysis: Stability Over Time

Flat trajectory: Returns trend line stays remarkably stable (~200 units) from Sep 2024 to Mar 2025
No alarming growth: Unlike many scaling businesses, return volumes aren't exponentially increasing
Operational insight: Return rates appear controlled, not spiraling due to poor quality management
Strategic takeaway: Focus should be on seasonal planning rather than crisis management

🔄 Seasonality: Holiday Return Surge

Peak return periods: Strong positive spikes in July and December-January
Post-holiday effect: December-January surge aligns with typical gift return behavior
Summer spike: July peak likely correlates with back-to-school shopping returns
Low return months: August-September and February-March show below-average return volumes
Planning advantage: Predictable patterns enable better resource allocation for customer service and logistics

⚡ Residuals: Unexpected Events

March 2025 anomaly: Sharp negative residual indicates returns dropped far below expected levels
Model accuracy: Most residuals near zero show trend + seasonality explains return patterns well
Investigation opportunity: March drop could signal quality improvements, policy changes, or reduced sales
Forecasting reliability: Low residual variance means future return predictions will be more accurate

🎯 Business Impact & Action Items

Resource planning: Staff customer service teams higher during July and December-January
Inventory management: Prepare for increased return processing during peak months
Quality investigation: Deep-dive into March 2025 factors that caused unexpected return drop
Forecasting confidence: Use decomposition insights to build more accurate return volume predictions
Cost optimization: Allocate return-handling budgets based on seasonal patterns rather than flat monthly estimates

🕒 Lead Time Component Analysis – What's Slowing Us Down?

To identify where delays actually occur in the delivery pipeline, I broke down the Total Lead Time into its two known parts:

Manufacturing Lead Time
Shipping Time

Plotting monthly trends over the year gave me some really clear insights:

df['Order_Month'] = df['Order_Date'].dt.to_period('M').dt.to_timestamp()
lead_time_components = df.groupby('Order_Month')[['Lead_Time', 'Manufacturing_Lead_Time', 'Shipping_Time']].mean().reset_index()
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=lead_time_components['Order_Month'], y=lead_time_components['Lead_Time'],
    mode='lines+markers', name='Total Lead Time', line=dict(color='black')
))
fig.add_trace(go.Scatter(
    x=lead_time_components['Order_Month'], y=lead_time_components['Manufacturing_Lead_Time'],
    mode='lines+markers', name='Manufacturing Lead Time', line=dict(color='blue')
))
fig.add_trace(go.Scatter(
    x=lead_time_components['Order_Month'], y=lead_time_components['Shipping_Time'],
    mode='lines+markers', name='Shipping Time', line=dict(color='green')
))

fig.update_layout(
    title='📦 Lead Time Component Trends',
    xaxis_title='Month',
    yaxis_title='Average Days',
    legend=dict(x=1, y=1),
    hovermode='x unified'
)

fig.show()

📌 Remarkable Stability Across All Lead Times

What immediately stood out was how flat all the lines were.
Month after month, Manufacturing, Shipping, and Total Lead Time barely budged.
This level of consistency suggests that, operationally, both production and shipping are highly standardized — or at the very least, predictably rigid.

🧮 Wait… What’s Missing From Total Lead Time?

Here's the math that caught my eye:
- Total Lead Time ~ 26 days
- Manufacturing ~ 6.7 days
- Shipping ~ 4.8 days
- That adds up to just ~11.5 days — so… what’s the rest of the ~14.5 days?

💡 This unexplained gap might include:

Internal order processing delays
Waiting time for raw materials or approvals
Quality checks
Warehouse queueing or documentation delays

🔍 What I’m Taking Away

The bottleneck clearly isn’t in manufacturing or logistics — they’re too stable.
The real lag likely lies in invisible or untracked operations sitting before or after the manufacturing and shipping stages.
That’s a clear signal to investigate internal workflows, supplier readiness, or even IT system integration delays that could be stretching overall delivery time.

🎛️ Simulation: Setting the Stage with Clean Monthly Data

With all exploratory insights done, I shifted gears and started prepping for what-if simulations.

First, I ran my custom data_loader.py script to:

✅ Clean the raw dataset
📊 Aggregate monthly sales, returns, and defect rates
🔎 Structure the data by Order_Month and Product_Type — just how a real ops team would model KPIs for planning

Here’s a sample from the output:

Order_Month	Product_Type	Sales	Returns	Avg_Defect_Rate
2024-05	Headphones	337	3	3.37%
2024-05	Laptop	730	7	2.72%
2024-05	Smartphone	1347	7	2.83%

This monthly product breakdown is exactly what I need as input for simulation scenarios like:

Demand spikes 📈
High-defect supplier batch 📦
Policy tweaks (e.g., refund caps, return window limits)

🔮 When Algorithms Try to Predict the Future: A Tale of 5 Products

Or: How I learned to stop worrying and love forecasting chaos

So I just finished running a forecasting engine on our historical sales data in forcasting_models.py, and honestly? It felt less like data science and more like five fortune tellers arguing over tea leaves.

Here’s what happened when I turned algorithms loose on five wildly different product categories:

🎯 How I Picked the “Winner” Algorithm (a.k.a. My Forecasting Fight Club)

Okay, so I had five products. I had five algorithms. And I needed to figure out which model to trust with my reputation (and maybe someone’s annual budget 🫣).

Naturally, I did what any responsible data nerd would do: set up a battle royale and let them fight it out.

⚖️ The Weapon of Choice: MAE (Mean Absolute Error)

MAE is my ride-or-die metric because it answers the question:

"Yo, on average, how wrong are we?"

Formula (for the math nerds):
MAE = Average(|Actual - Predicted|)

It’s like grading a bunch of darts — you don’t care where they hit, you just care how far off they landed from the bullseye. 🎯

🧪 Let the Tournament Begin (Time Series Cross-Validation Style)

Each model was tested like this:

Train on Jan–June → Predict July
Train on Jan–July → Predict August
Train on Jan–August → Predict September
... rinse and repeat.

And whoever had the lowest average “miss” (MAE) across all rounds won the title belt for that product.

🏁 And the Results Are In...

Product	🥇 Winner	💥 MAE Score	🧠 Why It Won
🎧 Headphones	Naive	68,924	Even chaos can’t beat dumb luck
💻 Laptop	Prophet	852,695	Loves clean, seasonal data
📱 Smartphone	Moving Average	1,288,991	Smoothed the mess like a champ
⌚ Smartwatch	Moving Average	60,653	Chill and accurate — like a pro
📱 Tablet	Moving Average	96,418	Mr. Consistent, no drama

🤔 So... Why These Models?

🧼 Moving Average: The Calm in the Storm (3 Wins)

What it does: "Let’s just average recent stuff and chill"
Why it worked: Because when your data’s throwing tantrums, being calm wins

🧃 Naive: The "Shrug Emoji" That Nailed Headphones

What it does: "Tomorrow = today"
Why it worked: Headphones data was basically chaos with a headset on. Sometimes, guessing blindly is your best bet.

🔮 Prophet: The Fancy Forecasting Intern Who Nailed Laptops

What it does: Finds trends + seasonality and pretends to be psychic
Why it worked: Laptops are chill. The data was neat and predictable. Prophet got to shine like it was presenting in a clean PowerPoint deck.

🧠 Real Talk: What I Learned

1. Simple Models Win Fights

Moving Average won 3 out of 5 times. The model equivalent of "just keep breathing" beat out the complex ones. Respect.

2. Volatile Data = Rage Quits for Algorithms

Headphones had a data spike that looked like a rocket launch. Algorithms saw that and went, "Nope." Naive method just nodded and kept going.

3. Prophet’s a Control Freak

It only worked when everything was predictable. If your data so much as sneezed, Prophet walked out the door.

💡 If You’re Doing This Too, Here’s the Dev Advice:

Use multiple models – because you never know which one will "get" your product
MAE is your buddy – clean, stable, doesn’t overreact to weird spikes
Don't obsess over complexity – sometimes the best forecast is just “the recent average”
Retrain your models – product behavior changes like code in a hackathon

TL;DR – My Forecasting Fight Club Rules 🥋

Each product is a different beast — no single model rules them all
Don’t worship complexity — even the Naive method had its moment
Let the data guide you — not your gut, not Stack Overflow, not some blog from 2012
Watch the MAE — it’ll tell you who’s messing up the least

P.S. If anyone asks why I didn’t use Deep Learning here... tell them because this wasn’t a Kaggle contest — it was real-world, messy, unfiltered, business-critical forecasting. And in the real world? Simplicity is king. 👑

📊 The Good, The Bad, and the “Wait, What?”

🎧 Headphones: The Rollercoaster That Broke Everyone's Brain

What Happened: Sales shot up to nearly 950K units in late 2024... then nosedived like my code on a Friday.
Algorithm Winner: Naive — yes, the “just repeat the last value” method.
Forecast: Every model basically shrugged and gave me a different story.

💡 My Take: When your data looks like a heart monitor in a horror movie, even the smartest models just tap out. Sometimes chaos is... just chaos.

💻 Laptops: The Steady Eddie

What Happened: Stable between 10–13 million units, with clean seasonal humps.
Algorithm Winner: Prophet (yes, Meta’s time series library pulled its weight here).
Forecast: Smooth upward trajectory, almost textbook.

💡 My Take: The model had it easy here — this is the dream product line for forecasting engines. Prophet probably enjoyed the vacation.

📱 Smartphones: The Plot Twist Master

What Happened: Two major peaks (~17M units), dramatic valleys in between.
Algorithm Winner: Moving Average — slow and steady won the race.
Forecast: Cautious optimism, trending gently upward.

💡 My Take: Sometimes all you need is the humble average. Fancy doesn’t always mean better — stability wins when the past is erratic.

⌚ Smartwatches: The Underdog Story

What Happened: Hovered around 1 million units, but threw in just enough drama.
Algorithm Winner: Moving Average (again!)
Forecast: Moderate growth... but the Trend model went full "TO THE MOON!" 🚀

💡 My Take: That brown Trend line? It’s your hype friend who's always yelling “This is our year!” while everyone else is cautiously optimistic.

📱 Tablets: The Seasonal Performer

What Happened: Peaked at 1.7 million, followed a bumpy but rhythmic cycle.
Algorithm Winner: Moving Average (MVP of the day).
Forecast: Conservative and calm... unless you ask the Trend model, which again dreams big.

💡 My Take: Tablets are behaving well, but they’ve got some subtle quirks. Like that coworker who's quiet but secretly brilliant..

🎯 What I Learned (Without the Buzzwords)

✅ 1. Simple Often Wins

Three out of five winners were just moving averages. Forget deep learning — sometimes you just need the mean of recent points.

⚠️ 2. Volatile Data = Forecasting Nightmare

If your history is all over the place (hello, headphones), don’t expect magic. Forecasting isn’t psychic — it’s math.

📈 3. Trend Models Are Hopeless Optimists

That brown line shooting to the sky? The Trend model. Always confident, sometimes delusional.

😅 4. Prophet Has Trust Issues

It only performed well on the most stable product — laptops. It’s like that friend who only speaks up when they know they’re right.

🤖 What This Means for Devs Like Us

Don't Overcomplicate
You don’t always need Prophet, LSTMs, or ensemble stacks. Sometimes "average last few values" just works.
Clean Data > Clever Models
Garbage in, garbage out. Focus on data quality before algorithm flexing.
Run Multiple Models
It's not about finding the answer — it's about seeing the range of possibilities. Let your stakeholders choose the risk appetite.
Be Honest About Uncertainty
Forecasts diverge for a reason. Communicate that. The future is messy — and that’s okay.

💡 Final Word

Forecasting is like weather prediction — you're not aiming to be perfectly right, you're trying to be usefully wrong.

Whether it’s moving average or Prophet, the point is this: give the business just enough foresight to make better decisions, not perfect ones.

Also — always bet on the moving average. It’s the "try turning it off and on again" of forecasting. Unsexy. Underrated. Unreasonably effective.

P.S. – If someone asks why the forecast lines diverge, just say you’re “capturing the full range of uncertainty in market dynamics.” That line has saved me more than once. 😉

🧠 Simulation Phase: Predicting, Stress-Testing & Learning from Returns

Before jumping into conclusions, it’s essential to help readers understand what we did, how we did it, and why it matters. Here's a breakdown of the simulation workflow and what each step teaches us.

🛠️ What Was Simulated and Why?

In the returns-heavy world of consumer electronics, operational costs, customer satisfaction, and risk mitigation depend on how well we can forecast demand, predict returns, and stress-test our system under different scenarios.

So I built an enhanced Supply Chain Simulation Engine using real-world monthly product-level data. The goal? Go beyond EDA and forecasting—run simulations to uncover:

🔁 Expected return costs
🧪 High-risk operational periods
⚠️ SLA (Service Level Agreement) breaches
📉 Satisfaction issues
💸 Worst-case financial risk using probabilistic analysis

🔍 Step-by-Step: What Happens in the Simulation Engine?

1. 📦 Data Aggregation & Preprocessing

Raw transactional records were cleaned, standardized, and grouped into monthly summaries for each product. Key metrics:

Monthly demand
Number of returns
Defect rates
SLA compliance
Lead times
Satisfaction scores

2. 📊 Forecasting Demand

For each product (e.g., Smartphones, Laptops), I built demand forecasting models using historical sales data. These were validated using:

R² (R-squared) – How well the model explains past variation
MAPE – Mean Absolute Percentage Error, to measure accuracy

✅ All models achieved excellent accuracy (R² ≥ 0.96, MAPE under 8%), making them reliable for simulation inputs.

3. 🧮 Estimating Return Costs

Each simulated demand period was evaluated to calculate:

Defects (based on observed defect rates)
SLA breaches (based on lead times vs. targets)
Resulting return costs (₹)

This helps assess how operational quality impacts financial performance.

4. 🚦 Risk Indicators

For each product, the simulation flags:

High Risk Periods – Months with above-threshold return costs
SLA Breaches – Times when service deadlines are missed
Customer Satisfaction Dips – Simulated based on defects & SLA delays

5. 🎲 Monte Carlo Simulation

Now, the fun part. Using the previously calculated average return cost, we run 10,000+ random trials with slightly varied inputs (like demand, defects, delays). This shows us:

What if demand unexpectedly spikes?
What if defects worsen?
How bad can our losses get?

This technique is called the Monte Carlo Method.

🎲 What is Monte Carlo Simulation?

It’s a statistical technique used to estimate uncertain outcomes by running simulations over and over with random variations.

🔍 Why it’s used:

Real-world operations are full of uncertainty (e.g., sudden demand surge, shipping delays).
Instead of relying on a single "average" prediction, Monte Carlo gives a distribution of possible outcomes.
Helps answer: “What’s the worst-case cost we should prepare for?”

💡 Example: For Smartphones, while the expected return cost was ₹35.6M, Monte Carlo shows it could go up to ₹58.7M (95th percentile). That’s critical info for financial planning.

🚀 Starting Enhanced Supply Chain Simulation...
🔍 Preparing data...
✅ Data prepared: (65, 4) records, 5 products
⚙️  Running simulation...
🔍 Simulating: Headphones
🔍 Simulating: Laptop
🔍 Simulating: Smartphone
🔍 Simulating: Smartwatch
🔍 Simulating: Tablet
🎯 Validating model...
📊 Creating visualizations...

🔍 Forecasting Performance – What Worked & What Didn’t

🎧 Headphones – Spikey & Unpredictable

🔺 Festival demand overshot reality (model predicted 7.5K+ units vs. actual ~5K)
❗ Lesson: My multipliers may be too aggressive for low-commitment purchases like headphones.
🧊 Risk: Overstocking during Q4 → Inventory carrying costs.

💻 Laptops – Surprisingly Predictable

✅ Best alignment between real & simulated demand.
🔁 Steady corporate cycle + fewer promotions make it easier to model.
🧠 Insight: This is the "gold standard" of how forecasting should behave.

📱 Smartphones – Model Overshoots, Market Shrugs

🔄 Forecast was wild, actual demand was chill.
📉 Market is saturated. Promotions barely move the needle now.
🧭 Action: Tune down sensitivity to events; focus more on baseline sales + replacements.

⌚ Smartwatches – Emerging but Solid

👍 Forecasting aligned well with real sales
📈 Shows manageable growth with minor spikes
🚀 Ready for smarter inventory strategies

🔥 Return Cost Risk Zones – Where the Trouble Hits

🗓️ Q4 2024: The Danger Zone

🎯 Highest return costs in Smartphones (₹10M+) and Laptops
🔁 Direct correlation between demand surges, defect multipliers, and return cost spikes
🔍 Model validation: Risk periods perfectly matched real-life issues

📊 Category-Wise Risk Summary

Product	Risk Level	Risk Drivers
Smartphones	🚨 High	High unit price + frequent SLA breaches
Laptops	⚠️ Medium-High	Predictable but expensive to return
Headphones	🟡 Moderate	High spikes, but cheaper per unit
Smartwatches	🟢 Low	Fewer returns, better quality consistency
Tablets	🟢 Low	Quiet, stable, nothing dramatic

Supply Chain Simulation Report

After stress-testing my supply chain using enhanced Monte Carlo simulations across five product categories, I uncovered some powerful patterns — both from a data science perspective and a business strategy lens.

This isn’t just another dashboard; it’s a quantitative reality check on what could go wrong (and how bad it might get).

⚙️ Tech Implementation Insights

✅ Model Performance Was 🔥

R² Scores: Every model scored above 0.92, with Smartwatches peaking at 0.988
MAPE (Mean Absolute Percentage Error): Stayed tight — 7.0% (Smartwatches) to 16.8% (Smartphones)
Why It Matters: These are strong results, confirming that my feature engineering, seasonal adjustments, and lead time handling were on point.

🎲 Monte Carlo Distribution Patterns

Monte Carlo helped me model not just “what’s likely,” but also “what’s possible” — especially worst-case risks.

📈 Distribution Shapes by Product:

Headphones & Smartwatches: Clean, tight bell curves — low volatility
Laptops & Smartphones: Long right tails — fat risk zones for extreme outcomes
Tablets: Weird one. Bimodal distribution shows dual cost behavior (maybe two buyer segments or data quality artifacts?)

📊 Volatility Snapshot: (95th Percentile vs Mean)

Product	Ratio	Volatility Tag
Tablet	1.67x	🚨 Highest
Headphones	1.64x	Moderate
Smartphone	1.64x	High
Smartwatch	1.62x	Low-Moderate
Laptop	1.60x	Lowest (Stable-ish)

💼 Business Risk Insights

💸 Cost Exposure Breakdown

Smartphones: ₹57.9M at 95th percentile → 📍 Top financial risk zone
Laptops: ₹33.1M → High impact, especially due to unit price
Headphones / Smartwatches / Tablets: Lower return costs, but still matter due to volume-driven spikes

🚨 Operational Red Flags Across the Board

Customer Satisfaction: All products rated below 2.50/10 🤯
SLA Breaches: Exactly 13 per product → smells like either a pattern or a pipeline bug
High-Risk Periods: 12–13 per product → possibly linked to Q4 festival pressure

👨‍💻 Developer POV: What I Learned (And What Broke)

🔎 1. Data Quality Smells

That suspicious “13 SLA breaches” across all products?

Could be a logging glitch
Or a shared system bottleneck (e.g., logistics API fails)
Either way, worth auditing the data pipeline

🧱 2. Simulation Architecture Held Up Well

Monte Carlo FTW: Gave me a reliable risk band across 1000s of simulations
Scalability: Simulated 459,646 demand events in a few minutes
Cross-Category Consistency: Smartwatches and Smartphones had very different demand profiles, but the model handled both smoothly

⚠️ 3. Technical Debt = Business Risk

Satisfaction Crisis: Below 2.5 across all products is a red flag. This isn’t just UX — it bleeds into returns.
Defect Rates: Hovering near 6% — unsustainable at scale
Financial Exposure: ₹95.7M in potential return costs → not theoretical. This will hurt if ignored.

Finally!! Monte Carlo Analysis

After simulating 459,646 transactions across 5 product lines using Monte Carlo methods, here’s what kept me up at night:

😱 Customer satisfaction scores: 2.15 – 2.50 out of 10
💸 Total potential return losses: ₹95.7 million
🛑 SLA breaches: Consistently 13 for every product (sus 🤨)

📱 Smartphones: The Drama Queen

Risk Range: ₹35.2M → ₹57.9M

“Like that chill friend who occasionally sets the kitchen on fire.”

Fat-tailed distribution — big potential for catastrophic losses
Highest financial exposure of all products
Prediction accuracy is solid — but volatility makes it hard to act on

🔁 Real Talk: Needs active risk caps + real-time alerts. Like, now.

💻 Laptops: The Steady Worrier

Risk Range: ₹20.7M → ₹33.1M

“Predictable but still makes you sweat.”

Distribution is tighter than phones, but not risk-free
1.6x jump from mean to 95th percentile feels industry-typical
Stable-ish R² and MAPE support trust in forecasts

🔁 Real Talk: Manageable, but needs controlled lead time windows.

🎧 Headphones: The People Pleaser

Risk Range: ₹9.2M → ₹15.1M

“Like the product that actually read the statistics manual.”

Textbook bell curve — most predictable product
Financially lowest risk among the big-volume categories

🔁 Real Talk: If I had to bet my bonus on anything, it’s headphones.

⌚ Smartwatches: The Overachiever

Risk Range: ₹6.0M → ₹9.7M

“The only product that doesn’t make me question my life choices.”

R² = 0.988 😮 — top-tier model performance
Tight clustering and accurate forecasts
Still has customer satisfaction issues (why though?)

🔁 Real Talk: This is how supply chains should behave.

📱 Tablets: The Wildcard

Risk Range: ₹5.5M → ₹9.2M

“Quiet colleague who occasionally drops wisdom bombs.”

Weird bimodal distributions — possibly missed seasonal patterns
Unstable forecasts, though not a huge cost driver

🔁 Real Talk: Might need segmentation or a better demand model.

🚨 The “Oh Crap” Moments

1. SLA Glitch or Cosmic Joke?

13 SLA breaches. On every product. Exactly.
🔧 Either my monitoring system is broken… or cursed.

2. Customers Are Actively Mad

Satisfaction scores below 2.5/10?
🧨 That’s not just bad — that’s refund-triggering-bad.

3. We’re Awesome at Predicting Pain

R² scores above 0.92 across the board.
👏 Great job, models — now can we stop bleeding money?

💰 What I'd Do If This Were My Money

🔧 Quick Fixes

🔍 Fix monitoring logic – The 13 SLA breaches scream “code bug”
📬 Build a customer feedback loop – Why do people hate our stuff?
📦 Double down on smartwatches – Use it as a template for other lines

⏳ Medium-Term Fixes

📉 Cap smartphone risk – Set quarterly exposure ceilings
🛡 Add quality gates – 6% defect rate = not okay
📊 Use live dashboards – So we can watch the money burn in HD

🌐 Long-Term Strategy

🧱 Build resilience – Fat tails = fragile supply chain
🔄 Add preventive modeling – Predict and prevent losses
💔 Revamp customer experience – 2.5/10 isn’t just a metric — it’s a warning sign

🎲 Why Monte Carlo?

Because one forecast is not enough.
Monte Carlo gives you 10,000 realities, including the one where you lose ₹57 million.

📌 Final Word

Monte Carlo isn’t about making pretty graphs — it’s about asking:
“What happens if everything breaks at once?”

Now that I’ve seen the risk, the only question is:
Do we act on it — or wait until it actually happens?

🧵 Wrapping It Up: Not a Nerd, Just Curious (And Sleep-Deprived) 😵‍💫

Okay so listen… this whole simulation + forecasting + blog post marathon?

Started as: “Hmm this looks fun, let me try forecasting return costs 😎”
Turned into: “Why is my laptop overheating? Why is Prophet hallucinating? Why is customer satisfaction 2.1???”
Ended as: “Bro I haven't replied to my girlfriend in 6 hours. I'm fighting charts instead of fighting for my relationship 💔”

Somewhere between debugging Naive forecasts and arguing with my matplotlib legend placement, I realized:
I might have gone too deep 🕳️

🫠 Real Talk: I'm Just a College Guy Doing My Best

I’m not claiming to be some analytics expert or supply chain wizard.
This is just me, trying to make sense of chaos with Python, a couple CSVs, and dangerously low sleep.

If you're an experienced dev/data person reading this:

Please don’t judge me too hard 😂
If you see something dumb — help fix it, don’t flame it
If it made you smile or think — drop a comment or DM, I’d love to hear from you

📬 Slide into the Comments Like:

“Yo Raj, here’s a better way to handle the defect multiplier”
“Bro, your forecast logic hurt me but here’s how to fix it”
“This is exactly how my semester project spiraled too 💀”

Final Thought 🎯

If there’s one thing this taught me, it’s this:

You don’t need to be a genius to simulate risk. You just need some curiosity, coffee, and commitment issues.

Thanks for reading. And if my girlfriend is reading this:
I’m sorry baby, it was just one more simulation... then 10,000 more Monte Carlos. 😭

https://www.linkedin.com/in/raj-lucky-7215aa259/ (My LinkedIn if anyone wanna connect obv no one would after reading this much).

raj376457@gmail.com
(Emails with code screenshots, bug fixes, or "bruh wtf is this" also welcome.)

Peace ✌️
— Raj, Still Debugging IRL

Subscribe to my newsletter

Read articles from Raj Lucky directly inside your inbox. Subscribe to the newsletter, and don't miss out.

data analysis supply chain returns analytics

Written by