Building an LSTM-Based Stock Price Prediction Model: An End-to-End Guide

In this blog post, we'll walk through building an LSTM model that predicts stock prices, using Python, TensorFlow, and Keras. We'll go step by step from data collection to saving the trained model, so that it can later be loaded by a web application (for example, one built with Flask) for real-time predictions.

Step 1: Data Collection and Preprocessing

First, we need to gather historical stock market data. We'll use the yfinance library to fetch this data for Google (GOOG) from 2012 to 2022.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf

# Define the time period and stock ticker
start = '2012-01-01'
end = '2022-12-21'
stock = 'GOOG'

# Download the stock data
data = yf.download(stock, start, end)
data.reset_index(inplace=True)

Step 2: Visualizing the Data

To understand the data better, we visualize the closing prices along with the 100-day and 200-day moving averages.

# Calculate moving averages
ma_100_days = data.Close.rolling(100).mean()
ma_200_days = data.Close.rolling(200).mean()

# Plot the closing prices and moving averages
plt.figure(figsize=(10, 6))
plt.plot(ma_100_days, 'r', label='100-Day MA')
plt.plot(ma_200_days, 'b', label='200-Day MA')
plt.plot(data.Close, 'g', label='Close Price')
plt.legend()
plt.show()
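
To see what rolling(100).mean() computes, here is a minimal sketch on a made-up five-value series with a window of 3. The values and the names prices and ma_3 are illustrative, not real GOOG data. The first entries are NaN because a full window is not yet available, which is why the moving-average lines in the plot start later than the closing price.

```python
import pandas as pd

# Toy closing prices (illustrative values only)
prices = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])

# 3-day moving average: mean of the current price and the two before it
ma_3 = prices.rolling(3).mean()

# The first two entries are NaN because the window is not full yet
print(ma_3.tolist())
```

With a 200-day window on real data, the first 199 entries are NaN in the same way.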

Step 3: Preparing the Data for Training

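Next, we prepare the data for training our LSTM model. We split the closing prices chronologically into a training set (the first 80%) and a testing set (the last 20%), fit a MinMaxScaler on the training set only, and cut the scaled series into 100-day input windows, each paired with the price on the following day.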

from sklearn.preprocessing import MinMaxScaler

# Drop missing values
data.dropna(inplace=True)

# Split the data into training and testing sets
train_size = int(len(data) * 0.80)
data_train = data.Close[:train_size]
data_test = data.Close[train_size:]

# Scale the data
scaler = MinMaxScaler(feature_range=(0, 1))
data_train_scaled = scaler.fit_transform(data_train.values.reshape(-1, 1))

# Create sequences of 100 days for training
x_train, y_train = [], []
for i in range(100, len(data_train_scaled)):
    x_train.append(data_train_scaled[i-100:i])
    y_train.append(data_train_scaled[i, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
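
The two preparation steps above can be sketched on a toy series to make the resulting shapes easier to follow. This is only an illustration under stated assumptions: it applies the min-max formula for the (0, 1) range by hand with NumPy instead of calling MinMaxScaler, uses a window of 3 instead of 100, and the names make_windows, lo, and hi are hypothetical.

```python
import numpy as np

def make_windows(series, window):
    """Return (x, y): x[i] holds `window` consecutive values, y[i] is the value right after them."""
    x, y = [], []
    for i in range(window, len(series)):
        x.append(series[i - window:i])
        y.append(series[i, 0])
    return np.array(x), np.array(y)

# Toy "prices": 10 training values followed by 4 later test values (illustrative only)
train = np.arange(1.0, 11.0).reshape(-1, 1)   # 1..10
test = np.arange(11.0, 15.0).reshape(-1, 1)   # 11..14

# Min-max scaling with statistics taken from the training split only
lo, hi = train.min(), train.max()
train_scaled = (train - lo) / (hi - lo)
test_scaled = (test - lo) / (hi - lo)   # exceeds 1 when test prices rise above the training maximum

x, y = make_windows(train_scaled, window=3)
print(x.shape, y.shape)   # (samples, window, features) and (samples,)
print(test_scaled.max())  # larger than 1 for this toy data
```

The three-dimensional shape of x (samples, time steps, features) is the layout a Keras LSTM layer expects as input. Scaled test values above 1 are a normal consequence of fitting the scaler on the training split alone when prices trend upward.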

Step 4: Building the LSTM Model

Now, let's build our LSTM model using Keras. The model stacks four LSTM layers of increasing size, each followed by a dropout layer to reduce overfitting, and ends with a single-unit Dense layer that outputs the predicted (scaled) price.

from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM

# Define the LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))

model.add(LSTM(60, activation='relu', return_sequences=True))
model.add(Dropout(0.3))

model.add(LSTM(80, activation='relu', return_sequences=True))
model.add(Dropout(0.4))

model.add(LSTM(120, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(x_train, y_train, epochs=50, batch_size=32, verbose=1)
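
For a sense of the model's size, the number of trainable weights can be worked out by hand. This sketch assumes the standard Keras LSTM layout (four gates, each with input weights, recurrent weights, and a bias vector); the helper names lstm_params and dense_params are hypothetical.

```python
def lstm_params(units, input_dim):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input per unit, plus one bias per unit
    return units * input_dim + units

# (units, input_dim) for the four LSTM layers defined above; Dropout layers add no weights
layers = [(50, 1), (60, 50), (80, 60), (120, 80)]
counts = [lstm_params(u, d) for u, d in layers]
total = sum(counts) + dense_params(1, 120)

print(counts)  # [10400, 26640, 45120, 96480]
print(total)   # 178761
```

Calling model.summary() after building the model is a quick way to check these figures against what Keras reports.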

Step 5: Testing the Model

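After training, we evaluate the model on the test data. The last 100 days of the training set are prepended to the test set so that the first test day has a full 100-day input window, and the predictions are converted back to the original price scale before plotting.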

# Prepare the test data
past_100_days = data_train.tail(100)
data_test = pd.concat([past_100_days, data_test], ignore_index=True)
data_test_scaled = scaler.transform(data_test.values.reshape(-1, 1))

# Create sequences for testing
x_test, y_test = [], []
for i in range(100, len(data_test_scaled)):
    x_test.append(data_test_scaled[i-100:i])
    y_test.append(data_test_scaled[i, 0])

x_test, y_test = np.array(x_test), np.array(y_test)

# Make predictions
y_pred = model.predict(x_test)

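# Rescale the predictions and targets back to prices.
# inverse_transform undoes both the scaling factor and the offset;
# multiplying by 1 / scaler.scale_ alone would leave prices shifted by the training minimum.
y_pred = scaler.inverse_transform(y_pred)
y_test = scaler.inverse_transform(y_test.reshape(-1, 1))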

# Plot the results
plt.figure(figsize=(10, 6))
plt.plot(y_test, 'g', label='Original Price')
plt.plot(y_pred, 'r', label='Predicted Price')
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend()
plt.show()
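
The plot gives a visual impression of the fit. To put numbers on it, one option is to compute error metrics on the rescaled prices and compare them with a naive baseline that predicts tomorrow's price as today's price. The sketch below uses made-up values; the functions rmse and mae and the arrays actual and predicted are hypothetical stand-ins for y_test and y_pred.

```python
import numpy as np

def rmse(actual, predicted):
    # Flatten first: model.predict returns shape (n, 1), and subtracting a
    # 1-D array from it would silently broadcast to an (n, n) matrix
    actual, predicted = np.ravel(actual), np.ravel(predicted)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    actual, predicted = np.ravel(actual), np.ravel(predicted)
    return float(np.mean(np.abs(actual - predicted)))

# Illustrative prices only, not real model output
actual = np.array([100.0, 102.0, 101.0, 105.0])
predicted = np.array([[101.0], [101.0], [103.0], [104.0]])  # column shape, like model.predict output

model_rmse = rmse(actual, predicted)
model_mae = mae(actual, predicted)

# Naive baseline: predict each day's price as the previous day's price
naive_rmse = rmse(actual[1:], actual[:-1])

print(model_rmse, model_mae, naive_rmse)
```

If the model's error is not clearly below the naive baseline, the predicted curve may simply be tracking yesterday's price with a one-day lag, which can still look convincing on a plot.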

Step 6: Saving the Model

Finally, we save our trained model so it can be loaded and used later for real-time predictions.

model.save('Stock_Predictions_Model.keras')

Conclusion

In this tutorial, we've built an LSTM model to predict stock prices. We covered data collection, preprocessing, model building, training, evaluation, and saving the trained model. The saved model is a starting point for more advanced stock prediction models and for real-time applications that load it to serve predictions. Happy coding!

Written by

Hairulnizam Hashim