E-commerce Yearly Sales Prediction: App vs Website Focus


A Machine Learning Approach using Linear Regression
Introduction
With the growing digital economy, deciding where to focus (mobile app or website) is a critical question for e-commerce businesses. In this project, I used a Kaggle dataset that compares mobile app and website usage and applied Linear Regression to predict yearly sales, measured as each customer's Yearly Amount Spent.
Dataset link: Kaggle - Focusing on Mobile App or Website
Dataset Overview
The dataset provides insights like:
Customer count
Time spent on app vs website
Length of membership
Session frequency
Yearly Amount Spent
Our goal was to predict Yearly Amount Spent using the other behavioral features.
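As a rough sketch of getting the data into pandas and checking it for nulls: the file name `ecommerce_customers.csv` and the helper `load_dataset` are hypothetical, and the three sample rows are made up so the snippet runs without the real download. The column names are the ones used in the modeling code later in this post.

```python
import io
import pandas as pd

# Hypothetical file name; replace with the path of the CSV downloaded from Kaggle.
CSV_PATH = "ecommerce_customers.csv"

# Column names taken from the modeling code in this post.
FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"


def load_dataset(source):
    """Read the CSV and keep only the numeric columns used for modeling."""
    df = pd.read_csv(source)
    missing = [col for col in FEATURES + [TARGET] if col not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return df[FEATURES + [TARGET]]


# Tiny made-up sample (NOT real rows from the dataset) so the sketch is runnable.
sample_csv = io.StringIO(
    "Email,Avg. Session Length,Time on App,Time on Website,Length of Membership,Yearly Amount Spent\n"
    "a@example.com,34.5,12.7,39.6,4.1,588.0\n"
    "b@example.com,31.9,11.1,37.3,2.7,392.2\n"
    "c@example.com,33.0,11.3,37.1,4.1,487.5\n"
)
df = load_dataset(sample_csv)  # with the real file: load_dataset(CSV_PATH)

# Null check: count missing values per column before modeling.
null_counts = df.isnull().sum()
print(df.shape)
print(null_counts)
```

With the real file, any non-zero entry in `null_counts` would need handling (for example with `df.dropna()`) before training.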
Tools & Libraries Used
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import math
import pylab
import scipy.stats as stats
Exploratory Data Analysis (EDA)
To start, I used Seaborn and Matplotlib to visualize relationships:
Heatmap to view feature correlation
Jointplots to explore how Time on App and Time on Website impact yearly spending
Pairplots to get a holistic view of data distribution
This revealed a strong correlation between Length of Membership and Yearly Amount Spent.
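The heatmap is a picture of the correlation matrix, so the same comparison can be sketched numerically. The snippet below uses synthetic stand-in data, not the Kaggle dataset: the sample size, means, and the weights 40 and 60 are invented purely so the example runs on its own.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in data (NOT the Kaggle dataset); all numbers here are invented.
df = pd.DataFrame({
    "Time on App": rng.normal(12, 1, n),
    "Time on Website": rng.normal(37, 1, n),
    "Length of Membership": rng.normal(3.5, 1, n),
})
# Invented relationship: spending depends on app time and membership, not website time.
df["Yearly Amount Spent"] = (
    40 * df["Time on App"]
    + 60 * df["Length of Membership"]
    + rng.normal(0, 10, n)
)

# Pairwise Pearson correlations between all columns.
corr = df.corr()

# Correlation of each feature with the target, strongest first.
target_corr = (
    corr["Yearly Amount Spent"]
    .drop("Yearly Amount Spent")
    .sort_values(ascending=False)
)
print(target_corr)
```

Passing `corr` to `sns.heatmap(corr, annot=True)` would render this matrix as the kind of heatmap described above.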
Model Building
I trained a Linear Regression model to predict Yearly Amount Spent from the other features.
Steps:
Data cleaning & null check
Splitting into train/test
Model training using LinearRegression() from sklearn
X = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
y = df['Yearly Amount Spent']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
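Once a model like this is fitted, its coefficients show how much the prediction changes per unit of each feature. The sketch below repeats the same split-and-fit steps on synthetic data so it runs without the Kaggle file; the generating weights (25, 40, 60) and the noise level are assumptions made for illustration, not values from the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

features = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]

rng = np.random.default_rng(42)
n = 500

# Synthetic stand-in data (NOT the Kaggle dataset); all numbers here are invented.
df = pd.DataFrame({
    "Avg. Session Length": rng.normal(33, 1, n),
    "Time on App": rng.normal(12, 1, n),
    "Time on Website": rng.normal(37, 1, n),
    "Length of Membership": rng.normal(3.5, 1, n),
})
df["Yearly Amount Spent"] = (
    25 * df["Avg. Session Length"]
    + 40 * df["Time on App"]
    + 60 * df["Length of Membership"]
    - 1000
    + rng.normal(0, 10, n)
)

X = df[features]
y = df["Yearly Amount Spent"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

# One coefficient per feature: the change in predicted spending per unit of that feature.
coef_table = pd.Series(model.coef_, index=features).sort_values(ascending=False)
print(coef_table)
print("Intercept:", model.intercept_)
```

Coefficient sizes are only directly comparable when the features are on similar scales; otherwise standardizing the features first gives a fairer ranking.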
Model Evaluation
After training, I evaluated the model on the held-out test set using:
y_pred = model.predict(X_test)
mae = metrics.mean_absolute_error(y_test, y_pred)
mse = metrics.mean_squared_error(y_test, y_pred)
rmse = math.sqrt(mse)
Results:
Mean Absolute Error (MAE): 7.23
Mean Squared Error (MSE): 79.81
Root Mean Squared Error (RMSE): 8.93
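For reference, the three metrics can be computed by hand with NumPy. This is a minimal sketch: the helper name `regression_errors` and the four example values are made up for illustration. The last line checks that the reported RMSE is the square root of the reported MSE.

```python
import math
import numpy as np


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = float(np.mean(np.abs(residuals)))  # average absolute error
    mse = float(np.mean(residuals ** 2))     # average squared error
    rmse = math.sqrt(mse)                    # back in the units of the target
    return mae, mse, rmse


# Made-up values, only to show the arithmetic.
y_true = [500.0, 450.0, 600.0, 550.0]
y_pred = [510.0, 440.0, 590.0, 570.0]
mae, mse, rmse = regression_errors(y_true, y_pred)
print(mae, mse, rmse)

# Consistency check on the reported results: sqrt(79.81) rounds to 8.93.
print(round(math.sqrt(79.81), 2))
```

Because MSE squares each residual, RMSE weights large misses more heavily than MAE does, which is why the reported RMSE (8.93) is higher than the MAE (7.23).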
Key Learnings
Length of Membership was the feature most strongly related to Yearly Amount Spent, which highlights the importance of customer retention.
Time on App had a stronger relationship with yearly spending than Time on Website, suggesting companies should consider focusing more on mobile platforms.
Even a simple Linear Regression model can provide powerful insights when paired with good feature understanding.
Conclusion
This project showed how regression can be used not just for prediction but for strategic decision-making, such as whether to invest more in mobile or web. I'm excited to keep experimenting and building smarter ML solutions!
Demo Video Link
Written by

Lokesh Patidar