Real-Life House Price Prediction with Linear Regression

Roemai

Predicting house prices is a core task in real estate analytics. In this project, I walk through how I built a linear regression model to predict house prices.

Project Overview

The dataset contains features such as house square footage, number of bedrooms, and location. The task is to predict each house's price from these features.

Steps Involved:

1. Data Preprocessing

The dataset contained missing values and a categorical variable, location. I filled missing values in the numeric columns with each column's mean and converted the location feature into numeric indicator columns using one-hot encoding.

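import pandas as pd

# Fill missing values with the column mean (assign back rather than using
# inplace=True on a column, which newer pandas versions warn about)
df['square_footage'] = df['square_footage'].fillna(df['square_footage'].mean())
df['bedrooms'] = df['bedrooms'].fillna(df['bedrooms'].mean())
# One-hot encode 'location'; drop_first avoids a redundant dummy column
df = pd.get_dummies(df, columns=['location'], drop_first=True)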
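To make the effect of these two steps concrete, here is a minimal sketch on a made-up four-row DataFrame. The name `toy`, the column values, and the location names are invented for illustration and do not come from the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; all values are invented for illustration only.
toy = pd.DataFrame({
    "square_footage": [1000.0, np.nan, 2000.0, 1500.0],
    "bedrooms": [2.0, 3.0, np.nan, 4.0],
    "location": ["downtown", "suburb", "rural", "suburb"],
    "price": [200_000, 250_000, 180_000, 300_000],
})

# Mean imputation: each NaN is replaced by the mean of the observed values
# in the same column.
for col in ["square_footage", "bedrooms"]:
    toy[col] = toy[col].fillna(toy[col].mean())

# One-hot encode 'location'. drop_first=True drops the first category
# (alphabetically 'downtown' here), so it becomes the implicit baseline.
toy = pd.get_dummies(toy, columns=["location"], drop_first=True)

print(toy.columns.tolist())
print(toy[["square_footage", "bedrooms"]])
```

One caveat worth keeping in mind: computing the mean on the full dataset before splitting lets a little test-set information into training. Computing the fill value on the training split only is a common alternative.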

2. Splitting Data

Next, I split the data into training and testing sets to evaluate the model's performance.

from sklearn.model_selection import train_test_split
X = df.drop('price', axis=1)
y = df['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
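The split can be sanity-checked on synthetic data. The sketch below uses a hypothetical 10-row frame (the name `demo` and its values are assumptions, not the project's data) to show that `test_size=0.2` holds out 20% of the rows and that a fixed `random_state` makes the split reproducible.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 10-row dataset, invented for illustration.
demo = pd.DataFrame({
    "square_footage": list(range(1000, 2000, 100)),
    "bedrooms": [2, 3, 3, 4, 2, 3, 4, 5, 3, 4],
    "price": list(range(200_000, 300_000, 10_000)),
})
X_demo = demo.drop("price", axis=1)
y_demo = demo["price"]

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(X_tr.shape, X_te.shape)  # 8 training rows and 2 test rows

# Repeating the call with the same seed yields the same rows in the test set.
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
same_split = X_te.index.equals(X_te2.index)
print(same_split)
```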

3. Model Training

Using linear regression, I trained the model to predict house prices based on the available features.

from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train, y_train)
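To see what fitting actually learns, it can help to train on synthetic data where the true relationship is known. The sketch below assumes a made-up, noise-free pricing formula (the coefficients 100 and 10,000 and the intercept 50,000 are invented) and checks that `LinearRegression` recovers them through its `coef_` and `intercept_` attributes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with a known relationship (assumed for illustration):
#   price = 50_000 + 100 * square_footage + 10_000 * bedrooms
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, size=50)
beds = rng.integers(1, 6, size=50)
X_syn = np.column_stack([sqft, beds])
y_syn = 50_000 + 100 * sqft + 10_000 * beds

model = LinearRegression().fit(X_syn, y_syn)

# One learned weight per feature, plus a bias term.
print(model.coef_)       # approximately [100, 10000]
print(model.intercept_)  # approximately 50000
```

On real data the fit will not be exact, but the coefficients can still be read the same way: each one is the predicted change in price for a one-unit change in that feature, holding the others fixed.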

4. Evaluation

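The model's performance was evaluated using Mean Squared Error (MSE), the average of the squared differences between the predicted and actual prices. Lower values are better, and because MSE is expressed in squared price units, its square root (RMSE) is often easier to interpret.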

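from sklearn.metrics import mean_squared_error
# Predict on the held-out test set, then compare with the actual prices
y_pred = lr.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')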
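As a worked example of the metric itself, the sketch below computes MSE by hand on three made-up price pairs and compares it with scikit-learn's result. The arrays `actual` and `predicted` are hypothetical. RMSE and R² (via `r2_score`) are included as complementary metrics, which is an addition here rather than part of the original evaluation.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual and predicted prices, invented for illustration.
actual = np.array([200_000.0, 250_000.0, 300_000.0])
predicted = np.array([210_000.0, 240_000.0, 330_000.0])

# MSE is the mean of the squared residuals.
mse_manual = np.mean((actual - predicted) ** 2)
mse_sklearn = mean_squared_error(actual, predicted)

# RMSE brings the error back into price units.
rmse = np.sqrt(mse_sklearn)

# R^2 is the share of variance in the actual prices explained by the predictions.
r2 = r2_score(actual, predicted)

print(f"MSE (manual):  {mse_manual:.1f}")
print(f"MSE (sklearn): {mse_sklearn:.1f}")
print(f"RMSE: {rmse:.1f}")
print(f"R^2:  {r2:.2f}")
```

Here the residuals are -10,000, 10,000, and -30,000, so the single large miss dominates the MSE. That sensitivity to large errors is a property of squaring the residuals.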

5. Visualization

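The following code plots actual vs. predicted prices. The closer the points lie to the dashed diagonal, the more accurate the predictions:

import matplotlib.pyplot as plt
# y_pred holds the model's predictions for X_test, i.e. lr.predict(X_test)
plt.scatter(y_test, y_pred)
# Dashed diagonal marks perfect predictions (predicted == actual)
lims = [y_test.min(), y_test.max()]
plt.plot(lims, lims, 'r--')
plt.xlabel('Actual Prices')
plt.ylabel('Predicted Prices')
plt.title('Actual vs Predicted House Prices')
plt.show()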

Conclusion

This project demonstrates how linear regression can be used to predict house prices. Linear regression makes a useful baseline. Further improvement could involve experimenting with other algorithms such as Random Forest or XGBoost, which can capture nonlinear relationships that a linear model misses.
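As a rough sketch of what such an experiment might look like, the code below fits both `LinearRegression` and scikit-learn's `RandomForestRegressor` on the same split and reports the test MSE of each. The data is synthetic and deliberately nonlinear (the formula, sample size, and hyperparameters are all assumptions for illustration), so the numbers say nothing about how the two models would compare on the project's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic, deliberately nonlinear data (invented for illustration).
rng = np.random.default_rng(42)
sqft = rng.uniform(500, 3000, size=300)
beds = rng.integers(1, 6, size=300)
noise = rng.normal(0, 5_000, size=300)
price = 50_000 + 0.05 * sqft**2 + 10_000 * beds + noise

X_cmp = np.column_stack([sqft, beds])
X_a, X_b, y_a, y_b = train_test_split(
    X_cmp, price, test_size=0.2, random_state=42
)

# Fit each candidate on the same training split and score it on the same test split.
candidates = {
    "linear": LinearRegression(),
    "forest": RandomForestRegressor(n_estimators=100, random_state=42),
}
results = {}
for name, estimator in candidates.items():
    estimator.fit(X_a, y_a)
    results[name] = mean_squared_error(y_b, estimator.predict(X_b))

for name, score in results.items():
    print(f"{name}: test MSE = {score:,.0f}")
```

Keeping the split and the metric identical across models is the important design choice here, since otherwise the scores are not comparable.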

Check out the code on GitHub
