Smarter Predictions with Random Forest Regression | My Machine Learning Journey


Introduction:
Hey there!
As part of my ML learning journey, I recently implemented a Random Forest Regression model — and I have to say, it was quite the experience!
In this post, I’ll walk you through what Random Forest Regression is, how it improves over Decision Trees, and how I used it to predict a continuous outcome in a project using Python and Scikit-learn. If you're curious about ensemble models or how forests can outperform individual trees — you're in the right place!
→ What is Random Forest Regression?
Random Forest is an ensemble learning technique, meaning it combines multiple models to produce better results. In this case, it trains multiple Decision Trees and averages their predictions.
How it works:
It builds several Decision Trees, each trained on a bootstrap sample of the dataset (rows drawn at random with replacement).
Each tree gives a prediction.
The final prediction is the average of all trees’ outputs.
This technique reduces overfitting and usually gives more accurate and stable predictions than a single Decision Tree.
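To make the "average of all trees" step concrete, here is a minimal sketch that checks it by hand. The data is synthetic and invented purely for illustration (it is not the salary dataset), and the names `X_demo`, `y_demo`, and `X_new` are my own placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, invented for illustration only (not the salary dataset).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(0, 0.2, size=200)

forest = RandomForestRegressor(n_estimators=25, random_state=0)
forest.fit(X_demo, y_demo)

X_new = np.array([[2.5], [6.5]])

# Collect each individual tree's prediction, then average by hand.
per_tree = np.array([tree.predict(X_new) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

print("Per-tree predictions shape:", per_tree.shape)  # (25, 2)
print("Manual average matches forest.predict:",
      np.allclose(manual_average, forest.predict(X_new)))
```

The fitted trees live in `forest.estimators_`, so you can inspect how much individual trees disagree before they are averaged.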
→ Why Use Random Forest Regression?
Reduces variance: No more wild predictions like in a single decision tree.
Handles non-linearity: Great for real-world messy data.
Robust and scalable: Works well even when you have lots of features or data.
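One way to see the variance reduction is to retrain both models on many fresh samples and measure how much their predictions move around. This is only a sketch under my own assumptions: the sine-plus-noise data, the 30 rounds, and the 50 trees are arbitrary choices for illustration, not something from the salary project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
x_test = np.linspace(0.5, 5.5, 50).reshape(-1, 1)

tree_preds, forest_preds = [], []
for round_idx in range(30):
    # Fresh noisy sample from the same underlying curve each round.
    X_round = rng.uniform(0, 6, size=(100, 1))
    y_round = np.sin(X_round[:, 0]) + rng.normal(0, 0.3, size=100)

    tree = DecisionTreeRegressor(random_state=0).fit(X_round, y_round)
    forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_round, y_round)

    tree_preds.append(tree.predict(x_test))
    forest_preds.append(forest.predict(x_test))

# Spread of predictions across training sets, averaged over the test points.
tree_spread = np.std(tree_preds, axis=0).mean()
forest_spread = np.std(forest_preds, axis=0).mean()
print(f"single tree spread:   {tree_spread:.3f}")
print(f"random forest spread: {forest_spread:.3f}")
```

A smaller spread means the model's prediction depends less on which particular training sample it happened to see, which is what "reduces variance" refers to.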
→ Tools & Libraries:
Python
NumPy
Pandas
Matplotlib
Scikit-learn (RandomForestRegressor)
📊 Dataset:
I used a clean dataset where the goal was to predict a salary based on the position level — ideal for regression analysis.
Position Level | Salary |
1 | 45000 |
2 | 50000 |
3 | 60000 |
... | ... |
10 | 1000000 |
→ Implementation in Python:
1. Import the Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
2. Load the Dataset
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values  # position level, kept 2-D as scikit-learn expects
y = dataset.iloc[:, 2].values    # salary (the target), 1-D
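The `1:2` slice matters because scikit-learn expects a 2-D feature matrix. Here is a small sketch of the difference, using only the rows visible in the table above. The three-column layout is an assumption inferred from the `iloc` indices, and the `Position` labels are hypothetical placeholders.

```python
import pandas as pd

# Only the rows shown in the post's table. The column layout is assumed
# from the iloc indices; the 'Position' labels are hypothetical placeholders.
demo = pd.DataFrame({
    "Position": ["A", "B", "C", "J"],
    "Level": [1, 2, 3, 10],
    "Salary": [45000, 50000, 60000, 1000000],
})

X_2d = demo.iloc[:, 1:2].values  # slice keeps the column dimension
X_1d = demo.iloc[:, 1].values    # a single index drops it
y_demo = demo.iloc[:, 2].values  # target values

print(X_2d.shape, X_1d.shape, y_demo.shape)  # (4, 1) (4,) (4,)
```

Passing the 1-D version to `fit` would raise an error asking you to reshape the data, so the slice saves a `reshape(-1, 1)` call.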
3. Train the Random Forest Regressor
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(X, y)
n_estimators=100 means we're using 100 trees 🌲.
random_state ensures consistent results between runs.
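Here is a quick sketch of what those two parameters do in practice, assuming a small synthetic dataset that I made up for the demo (the helper `fit_and_predict` is also just for illustration).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.uniform(1, 10, size=(60, 1))
y_demo = X_demo[:, 0] ** 2 + rng.normal(0, 2.0, size=60)
X_query = np.array([[6.5]])

def fit_and_predict(seed):
    """Train a 100-tree forest with the given seed and predict at X_query."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_demo, y_demo)
    return model, model.predict(X_query)[0]

model_a, pred_a = fit_and_predict(0)
model_b, pred_b = fit_and_predict(0)
model_c, pred_c = fit_and_predict(1)

print("trees in the forest:", len(model_a.estimators_))  # 100
print("same seed, same prediction:", pred_a == pred_b)
print("different seed:", pred_a, "vs", pred_c)
```

Fixing the seed makes results repeatable, which is handy when comparing runs or sharing a notebook.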
4. Make a Prediction
y_pred = regressor.predict([[6.5]])  # predict() takes 2-D input and returns an array
print(f"Predicted Salary for level 6.5: {y_pred[0]:.2f}")
5. Visualize the Results (with higher resolution)
X_grid = np.arange(X.min(), X.max(), 0.01)  # dense grid so the steps are visible
X_grid = X_grid.reshape(-1, 1)  # back to 2-D for predict()
plt.scatter(X, y, color='red')
plt.plot(X_grid, regressor.predict(X_grid), color='green')
plt.title('Random Forest Regression')
plt.xlabel('Position Level')
plt.ylabel('Salary')
plt.show()
📈 Output:
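The plot is step-wise, because every tree predicts a constant value within each leaf, but the steps are finer than a single decision tree's because the forest averages many trees. Since the prediction for level 6.5 is an average over 100 trees, it tends to be more stable than a single decision tree's prediction.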
→ Key Takeaways:
Random Forest often performs better than a single decision tree because averaging many trees reduces variance.
It’s one of the most widely used algorithms for both regression and classification.
Minimal preprocessing needed: no feature scaling, and usually little feature engineering to get started.
The model is interpretable to some extent (for example, through feature importances) and generally reliable.
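On the interpretability point, a fitted forest exposes `feature_importances_`. Here is a minimal sketch, assuming a made-up dataset with one informative column and one pure-noise column (the names `signal` and `noise_feature` are mine, not from the salary data).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: only the first column actually drives the target.
rng = np.random.default_rng(0)
n = 300
signal = rng.uniform(0, 10, size=n)
noise_feature = rng.uniform(0, 10, size=n)
X_demo = np.column_stack([signal, noise_feature])
y_demo = 3.0 * signal + rng.normal(0, 1.0, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_demo, y_demo)

# Importances are normalized to sum to 1 across features.
importances = model.feature_importances_
for name, importance in zip(["signal", "noise_feature"], importances):
    print(f"{name}: {importance:.3f}")
```

These scores give a rough ranking of which inputs the trees relied on; they don't explain individual predictions.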
→ What I Learned:
The importance of ensemble models in reducing overfitting
How increasing the number of estimators (trees) can improve accuracy, usually with diminishing returns and longer training time
Visualization can reveal how the model captures trends in data
Random Forest is often a solid baseline model in any regression task
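To explore the effect of the number of trees yourself, here is a small sketch that scores forests of different sizes on held-out data. The dataset, the split, and the tree counts are assumptions chosen for illustration, so the exact numbers will differ on real data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(7)
X_demo = rng.uniform(0, 6, size=(400, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(0, 0.3, size=400)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

scores = {}
for n_trees in (1, 10, 100):
    model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    model.fit(X_train, y_train)
    # R^2 on data the model has not seen during training.
    scores[n_trees] = r2_score(y_test, model.predict(X_test))
    print(f"{n_trees:>3} trees -> test R^2 = {scores[n_trees]:.3f}")
```

Scoring on a held-out split (rather than the training data) is what makes the comparison meaningful, since a forest can fit its own training points very closely.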
Written by

Lokesh Patidar
Hey, I'm Lokesh Patidar! I'm a 2nd-year student at SATI Vidisha, passionate about AI, Machine Learning, Full-Stack Development, and DSA. What I'm learning: currently exploring Machine Learning 🤖, completed DSA & Frontend Development 🌐, now exploring Backend Development. 💡 Interests: I love solving problems, building projects, and integrating AI into real-world applications. Excited to contribute to tech communities and share my learning journey! 📌 Follow my blog for insights on AI, ML, and Full-Stack projects!