E-commerce Yearly Sales Prediction: App vs Website Focus

Lokesh Patidar

A Machine Learning Approach using Linear Regression


Introduction

As the digital economy grows, deciding where to focus, the mobile app or the website, is a critical question for e-commerce businesses. In this project, I used a Kaggle dataset that compares mobile app and website sales performance and applied Linear Regression to predict yearly sales revenue, recorded in the dataset as Yearly Amount Spent.

🔗 Dataset link: Kaggle - Focusing on Mobile App or Website


πŸ“ Dataset Overview

The dataset provides insights like:

  • Customer count

  • Time spent on app vs website

  • Length of membership

  • Average session length

  • Yearly Amount Spent

The goal was to predict the Yearly Amount Spent from the other behavioral features.


πŸ› οΈ Tools & Libraries Used

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import math
import pylab
import scipy.stats as stats
```

📊 Exploratory Data Analysis (EDA)

To start, I used Seaborn and Matplotlib to visualize relationships:

  • Heatmap to view feature correlation

  • Jointplots to explore how Time on App and Time on Website impact yearly spending

  • Pairplots to get a holistic view of data distribution

These plots revealed a strong correlation between Length of Membership and Yearly Amount Spent.
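
Here is a minimal sketch of how this EDA step could look. It runs on synthetic stand-in data rather than the Kaggle file, so the helper names (`make_demo_frame`, `correlation_with_target`, `plot_eda`) and the weights used to generate the data are assumptions for illustration only, not results from the project:

```python
import numpy as np
import pandas as pd

TARGET = "Yearly Amount Spent"


def make_demo_frame(n=500, seed=0):
    """Build synthetic stand-in data (NOT the Kaggle dataset; weights are arbitrary)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Avg. Session Length": rng.normal(33, 1, n),
        "Time on App": rng.normal(12, 1, n),
        "Time on Website": rng.normal(37, 1, n),
        "Length of Membership": rng.normal(3.5, 1, n),
    })
    weights = np.array([25.0, 40.0, 0.5, 60.0])  # arbitrary illustrative weights
    df[TARGET] = df.to_numpy() @ weights + rng.normal(0, 10, n)
    return df


def correlation_with_target(df, target=TARGET):
    """Rank numeric features by their correlation with the target column."""
    corr = df.select_dtypes(include="number").corr()
    return corr[target].drop(target).sort_values(ascending=False)


def plot_eda(df, target=TARGET):
    """Draw the heatmap, jointplots, and pairplot described above."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    numeric = df.select_dtypes(include="number")
    plt.figure(figsize=(6, 5))
    sns.heatmap(numeric.corr(), annot=True, cmap="coolwarm")
    sns.jointplot(data=df, x="Time on App", y=target)
    sns.jointplot(data=df, x="Time on Website", y=target)
    sns.pairplot(numeric)
    plt.show()


demo = make_demo_frame()
ranking = correlation_with_target(demo)
print(ranking)
```

Calling `plot_eda(demo)` opens the plots. The printed ranking is the same information the heatmap shows in its Yearly Amount Spent row, just sorted.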


βš™οΈ Model Building

I trained a Linear Regression model to predict Yearly Amount Spent from the other features.

πŸ” Steps:

  1. Data cleaning & null check

  2. Splitting into train/test

  3. Model training using LinearRegression() from sklearn

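```python
# Features: the four behavioral columns; target: yearly spend
X = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
y = df['Yearly Amount Spent']

# Hold out 20% of the rows for testing; the fixed seed keeps the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
```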
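
The null check from step 1 and a look at the fitted coefficients could be sketched as follows. This is an illustrative example on synthetic stand-in data, not the Kaggle file: the generating weights are arbitrary assumptions, and `null_counts` and `coefficients` are names introduced here for the sketch.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"

# Synthetic stand-in data (NOT the Kaggle dataset); the weights are arbitrary.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Avg. Session Length": rng.normal(33, 1, n),
    "Time on App": rng.normal(12, 1, n),
    "Time on Website": rng.normal(37, 1, n),
    "Length of Membership": rng.normal(3.5, 1, n),
})
true_weights = np.array([25.0, 40.0, 0.5, 60.0])
df[TARGET] = df[FEATURES].to_numpy() @ true_weights + rng.normal(0, 10, n)

# Step 1: count missing values per column, then drop incomplete rows (if any).
null_counts = df[FEATURES + [TARGET]].isnull().sum()
df = df.dropna(subset=FEATURES + [TARGET])

# Steps 2 and 3: split, then fit the linear model.
X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES], df[TARGET], test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# One coefficient per feature: the change in predicted yearly spend
# for a one-unit increase in that feature, holding the others fixed.
coefficients = pd.Series(model.coef_, index=FEATURES).sort_values(ascending=False)
print(null_counts)
print(coefficients)
```

Raw coefficients are only directly comparable when the features are on similar scales. If they are not, standardizing the features first gives a fairer ranking.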

📈 Model Evaluation

After training, I evaluated the model on the held-out test set using:

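```python
# Predict on the test set, then compare the predictions with the true values
y_pred = model.predict(X_test)
mae = metrics.mean_absolute_error(y_test, y_pred)
mse = metrics.mean_squared_error(y_test, y_pred)
rmse = math.sqrt(mse)  # RMSE is the square root of MSE
```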

📉 Results:

  • Mean Absolute Error (MAE): 7.23

  • Mean Squared Error (MSE): 79.81

  • Root Mean Squared Error (RMSE): 8.93
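
To make these three numbers concrete, here is a small sketch that computes them from their definitions with NumPy. The function name `regression_errors` and the three-point example are made up for illustration. Note that the reported values are consistent with each other, since the square root of 79.81 is about 8.93, and that MAE and RMSE are in the same units as Yearly Amount Spent.

```python
import numpy as np


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) computed directly from the residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = float(np.mean(np.abs(residuals)))  # average absolute error
    mse = float(np.mean(residuals ** 2))     # average squared error
    rmse = float(np.sqrt(mse))               # back in the target's units
    return mae, mse, rmse


# Toy example: the residuals are 1, 0, and -2.
mae, mse, rmse = regression_errors([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(mae, mse, rmse)
```

Because MSE squares each residual, a few large misses raise RMSE more than they raise MAE, which is why RMSE is never smaller than MAE.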


💡 Key Learnings

  • Length of Membership was the most important feature, highlighting customer retention's importance.

  • Time on App had a stronger relationship with yearly spending than Time on Website, suggesting companies should consider focusing more on mobile platforms.

  • Even a simple Linear Regression model can provide powerful insights when paired with good feature understanding.


✅ Conclusion

This project showed how regression can be used not just for prediction but for strategic decision-making, such as whether to invest more in mobile or web. I'm excited to keep experimenting and building smarter ML solutions!


Written by

Lokesh Patidar

Hey, I'm Lokesh Patidar! I'm a 2nd-year student at SATI Vidisha, passionate about AI, Machine Learning, Full-Stack Development, and DSA. What I'm Learning: Currently Exploring Machine Learning 🤖 Completed DSA & Frontend Development 🌐 Now exploring Backend Development 💡 Interests: I love solving problems, building projects, and integrating AI into real-world applications. Excited to contribute to tech communities and share my learning journey! 📌 Follow my blog for insights on AI, ML, and Full-Stack projects!