How to Create a Simple Stock Price Prediction Project Using Linear Regression

Raka Alfarizi

Step 1: Setting Up Your Environment

  1. Install Python: Ensure Python is installed on your computer. Download the latest version from python.org.

  2. Install Jupyter Notebook: Install Jupyter Notebook to organize your code and analysis.

     pip install notebook
    
  3. Install Required Libraries: Install libraries such as numpy, pandas, matplotlib, scikit-learn, and yfinance.

     pip install numpy pandas matplotlib scikit-learn yfinance
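Before moving on, it can help to confirm that the libraries are visible to the Python interpreter you will use. The following is a minimal sketch using only the standard library; the helper name find_missing and the REQUIRED_MODULES list are illustrative assumptions, not part of the original project.

```python
import importlib.util

# Import names can differ from pip names: scikit-learn is imported as sklearn
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "sklearn", "yfinance"]

def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If anything is reported as missing, rerun the pip command above in the same environment that runs this check.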
    

Step 2: Collecting Stock Data

  1. Open Jupyter Notebook:

     jupyter notebook
    
  2. Import Libraries:

     import numpy as np
     import pandas as pd
     import matplotlib.pyplot as plt
     from sklearn.model_selection import train_test_split
     from sklearn.linear_model import LinearRegression
     import yfinance as yf
    
  3. Download Stock Data:

    • Choose the stock you want to predict (e.g., Apple Inc. with ticker AAPL).

    • Download historical stock data using yfinance.

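    # Download stock data
    data = yf.download('AAPL', start='2020-01-01', end='2023-01-01')

    # Some yfinance versions return two-level (price, ticker) column labels;
    # keep only the price level so data['Close'] is a single column
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)

    # Display the first few rows
    print(data.head())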

Step 3: Understanding and Cleaning Data

  1. Visualize Data:

     plt.figure(figsize=(10, 6))
     plt.plot(data['Close'])
     plt.title('Apple Inc. Closing Price')
     plt.xlabel('Date')
     plt.ylabel('Closing Price (USD)')
     plt.show()
    
  2. Focus on the Closing Price Column:

    • We will use each day's Close value to predict the next trading day's closing price.

    df = data[['Close']].copy()
    df['Next Close'] = df['Close'].shift(-1)  # Next day's closing price
    df.dropna(inplace=True)  # The last row has no next day, so drop it
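To see what the shift(-1) and dropna() steps do, here is a minimal sketch in plain Python on a few made-up closing prices. The numbers and the helper name make_next_day_pairs are assumptions for illustration only, not real AAPL data. Each row pairs one day's close with the following day's close, and the final day is dropped because it has no next day.

```python
def make_next_day_pairs(closes):
    """Pair each closing price with the next day's closing price."""
    # zip stops at the shorter sequence, so the final day (which has no
    # next day) is dropped, mirroring shift(-1) followed by dropna()
    return list(zip(closes[:-1], closes[1:]))

# Made-up prices for illustration only
closes = [100.0, 102.0, 101.0, 105.0]
pairs = make_next_day_pairs(closes)
for today, tomorrow in pairs:
    print(f"Close: {today}  Next Close: {tomorrow}")
```

Four prices produce three training rows, which is why df ends up one row shorter than data.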

Step 4: Splitting Data for Training and Testing

  1. Prepare the Data:

     X = df[['Close']].values
     y = df['Next Close'].values
    
     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
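Passing shuffle=False keeps the rows in date order, so the model trains on the earlier period and is tested on the later one. The sketch below shows the same idea in plain Python; the helper name chronological_split is hypothetical, and rounding the test size up is a choice made for this sketch.

```python
import math

def chronological_split(values, test_size=0.2):
    """Split a sequence into an earlier training part and a later test part."""
    n_test = math.ceil(len(values) * test_size)  # size of the held-out tail
    n_train = len(values) - n_test
    return values[:n_train], values[n_train:]

# Stand-in for 10 trading days in date order
days = list(range(1, 11))
train_days, test_days = chronological_split(days)
print(train_days)
print(test_days)
```

Every test day comes after every training day, which avoids evaluating the model on dates that precede the data it learned from.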
    

Step 5: Building the Linear Regression Model

  1. Train the Model:

     model = LinearRegression()
     model.fit(X_train, y_train)
    
  2. Make Predictions:

     y_pred = model.predict(X_test)
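With a single input feature, fitting by ordinary least squares amounts to finding the straight line next_close ≈ slope * close + intercept that minimizes the squared errors. The following plain-Python sketch shows that calculation on made-up points; fit_line and predict are hypothetical helpers written for illustration, not scikit-learn's internals.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, xs):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]

# Made-up points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(predict(slope, intercept, [5.0]))
```

In the project itself, the fitted values are available as model.coef_ and model.intercept_ after calling model.fit.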
    

Step 6: Evaluating the Model

  1. Visualize Prediction Results:

     plt.figure(figsize=(10, 6))
     plt.plot(y_test, label='Actual Closing Price')
     plt.plot(y_pred, label='Predicted Closing Price')
     plt.legend()
     plt.title('Apple Inc. Closing Price Prediction')
     plt.xlabel('Days')
     plt.ylabel('Closing Price (USD)')
     plt.show()
    
  2. Calculate Mean Absolute Error (MAE):

     from sklearn.metrics import mean_absolute_error
    
     mae = mean_absolute_error(y_test, y_pred)
     print(f'Mean Absolute Error: {mae}')
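MAE is the average size of the prediction errors, in the same unit as the prices (USD here). As a minimal sketch of the arithmetic, with made-up numbers and a hypothetical helper name mean_absolute_error_manual:

```python
def mean_absolute_error_manual(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up values for illustration only
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 101.0, 108.0]

# Absolute errors are 1, 1, and 4, so the average is 6 / 3
mae_example = mean_absolute_error_manual(actual, predicted)
print(f"Mean Absolute Error: {mae_example}")
```

An MAE of 2.0 would mean the predictions are off by two dollars on average.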
    

Step 7: Documentation and Publication

  1. Document Your Project:

    • Document each step in the notebook.

    • Explain each part of the code and the results obtained.

  2. Publish on GitHub:

    • Create a new repository on GitHub.

    • Upload the Jupyter notebook (.ipynb) to the repository.

    • Add a README explaining your project, how to run it, and the results obtained.

Complete Code Example

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import yfinance as yf
from sklearn.metrics import mean_absolute_error

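# Download stock data
data = yf.download('AAPL', start='2020-01-01', end='2023-01-01')

# Some yfinance versions return two-level (price, ticker) column labels;
# keep only the price level so data['Close'] is a single column
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.get_level_values(0)

# Display the first few rows
print(data.head())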

# Visualize data
plt.figure(figsize=(10, 6))
plt.plot(data['Close'])
plt.title('Apple Inc. Closing Price')
plt.xlabel('Date')
plt.ylabel('Closing Price (USD)')
plt.show()

# Focus on the closing price column
df = data[['Close']].copy()
df['Next Close'] = df['Close'].shift(-1)  # Next day's closing price
df.dropna(inplace=True)

# Prepare the data
X = df[['Close']].values
y = df['Next Close'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Visualize prediction results
plt.figure(figsize=(10, 6))
plt.plot(y_test, label='Actual Closing Price')
plt.plot(y_pred, label='Predicted Closing Price')
plt.legend()
plt.title('Apple Inc. Closing Price Prediction')
plt.xlabel('Days')
plt.ylabel('Closing Price (USD)')
plt.show()

# Calculate Mean Absolute Error (MAE)
mae = mean_absolute_error(y_test, y_pred)
print(f'Mean Absolute Error: {mae}')

Conclusion

By following these steps, you can create a simple stock price prediction project using linear regression. This project will help you understand the basics of data processing, machine learning, and result visualization, which are essential skills for developing an AI Agency in the future. Publish this project on GitHub and share your results to build your portfolio.
