Housing Price Prediction
This project is based on Chapter 2 of the Hands-On Machine Learning book. The goal of the project is to build a housing price prediction model using the California housing dataset.
Project link: peeplika/Housing-price-prediction (github.com)
Step 1: Downloading the Data
The book's version of the California housing dataset was used. It is a slightly simplified variant of the original census-based dataset, which makes it easier to work with while still providing enough data for training the model.
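As a rough illustration of the download step, the sketch below fetches a compressed archive and extracts it using only the standard library. The URL, the `fetch_housing_data` name, and the `datasets` folder are assumptions made for this example, not details taken from the project; substitute whatever location the project actually uses.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumed archive location (not stated in this post); replace as needed.
HOUSING_URL = "https://github.com/ageron/data/raw/main/housing.tgz"


def fetch_housing_data(url: str = HOUSING_URL, dest_dir: str = "datasets") -> Path:
    """Download the archive unless it is already cached, then extract it.

    Returns the directory the archive was extracted into.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "housing.tgz"
    if not archive.is_file():
        # Only hit the network when there is no local copy yet.
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(path=dest)
    return dest
```

Calling `fetch_housing_data()` once leaves a local copy behind, so later runs skip the download. The extracted CSV can then be read with a tool such as pandas.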
Step 2: Splitting the Data
A portion of the data was set aside for testing before any analysis was performed. This ensures that the test data remains unseen during the model training process, providing a realistic evaluation of the model's performance.
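A minimal sketch of holding out a test set with scikit-learn's `train_test_split` is shown here. The 20% test size, the random seed, and the synthetic array standing in for the housing table are assumptions for illustration; the post does not state the values actually used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing table: 1,000 rows, 3 numeric columns.
rng = np.random.default_rng(42)
data = rng.normal(size=(1000, 3))

# Hold out 20% of the rows; the fixed seed makes the split reproducible,
# so the same rows stay in the test set across runs.
train_set, test_set = train_test_split(data, test_size=0.2, random_state=42)

print(train_set.shape, test_set.shape)  # (800, 3) (200, 3)
```

Fixing `random_state` matters here: if the split changed on every run, the model would eventually see every row and the test set would no longer be unseen.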
Step 3: Visualizing the Data
To gain a better understanding of the dataset, visualizations such as histograms were created. These visualizations revealed the distribution of features, allowing potential patterns or outliers to be identified.
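The following sketch shows one way to draw such a histogram with Matplotlib. It uses synthetic, right-skewed values in place of a real column; the `median_income` name and the choice of 50 bins are assumptions for this example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Synthetic, right-skewed stand-in for a numeric column.
rng = np.random.default_rng(0)
median_income = rng.lognormal(mean=1.2, sigma=0.5, size=5000)

fig, ax = plt.subplots(figsize=(6, 4))
# hist returns the per-bin counts and the bin edges along with the drawn bars.
counts, bin_edges, _ = ax.hist(median_income, bins=50)
ax.set_xlabel("median_income (synthetic)")
ax.set_ylabel("number of districts")

# fig.savefig("median_income_hist.png")  # uncomment to write the plot to disk
plt.close(fig)
```

A long tail on one side of a plot like this is the kind of pattern that can suggest a transformation or a closer look at outliers.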
Step 4: Modifying the Data
The data was modified to enhance its quality. This included:
Adding new, relevant features such as rooms_per_household.
Handling missing values to avoid losing critical information.
Converting categorical features to numerical values using one-hot encoding.
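The three modifications above can be sketched on a tiny hand-made table. The column names other than `rooms_per_household`, the sample values, and the choice of median imputation are assumptions for illustration; the post only says that missing values were handled, not how.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Tiny illustrative table; the values are made up for this example.
housing = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "total_bedrooms": [129.0, np.nan, 190.0, 235.0],
    "households": [126.0, 1138.0, 177.0, 219.0],
    "ocean_proximity": ["NEAR BAY", "INLAND", "NEAR BAY", "INLAND"],
})

# 1. Add a ratio feature that is often more informative than raw totals.
housing["rooms_per_household"] = housing["total_rooms"] / housing["households"]

# 2. Fill missing numeric values with each column's median (assumed strategy).
num_cols = ["total_rooms", "total_bedrooms", "households", "rooms_per_household"]
imputer = SimpleImputer(strategy="median")
housing[num_cols] = imputer.fit_transform(housing[num_cols])

# 3. One-hot encode the categorical column: one 0/1 column per category.
encoder = OneHotEncoder(handle_unknown="ignore")
cat_encoded = encoder.fit_transform(housing[["ocean_proximity"]]).toarray()

print(list(encoder.categories_[0]))  # ['INLAND', 'NEAR BAY']
```

Imputing instead of dropping rows keeps every district in the training data, which matches the stated aim of not losing information.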
Step 5: Trying Different Models
Several machine learning models were tested to determine which one performed best on the dataset. Comparing candidates under the same conditions makes it more likely that a suitable one is selected.
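One common way to compare candidates is cross-validated RMSE, sketched below. The three models, the 5-fold setting, and the synthetic data are assumptions for this example; the post does not list which models were actually tried.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the prepared housing features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

# Hypothetical candidate set, chosen only for illustration.
models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

rmse = {}
for name, model in models.items():
    # scikit-learn maximizes scores, so MSE is reported negated.
    neg_mse = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    rmse[name] = float(np.sqrt(-neg_mse).mean())

best = min(rmse, key=rmse.get)
print(rmse, "best:", best)
```

Scoring every model on the same folds keeps the comparison fair, and none of it touches the held-out test set.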
Step 6: Fine-Tuning the Model
After selecting the best-performing model, hyperparameter tuning was applied to improve its performance further. This step searches over candidate settings instead of relying on the defaults.
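A minimal sketch of hyperparameter tuning with scikit-learn's `GridSearchCV` follows. The random forest, the parameter grid, and the synthetic data are all assumptions for illustration; the post does not say which model or search method was used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the prepared housing features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Hypothetical grid: 2 x 2 = 4 combinations, each evaluated with 3-fold CV.
param_grid = {"n_estimators": [10, 30], "max_features": [2, 4]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

# best_score_ is the negated mean MSE of the winning combination.
best_rmse = float(np.sqrt(-search.best_score_))
print(search.best_params_, best_rmse)
```

By default the search refits the best combination on all the data passed to `fit`, so `search.best_estimator_` is ready to use afterwards.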
Step 7: Testing the Model
Finally, the model was tested on the previously unseen test data. Performance metrics were recorded to evaluate the model's ability to predict housing prices accurately.
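The final evaluation can be sketched as follows, using root mean squared error (RMSE) as the metric. The metric, the linear model, and the synthetic data are assumptions for this example; the post does not state which metrics were recorded.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for the prepared housing features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows only, then score once on the held-out rows.
model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)
test_rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
print(f"test RMSE: {test_rmse:.3f}")
```

RMSE is expressed in the same units as the target, so on the real dataset it would read as a typical prediction error in dollars.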