Real-Time Bitcoin Price Prediction Using LSTM & SQL Integration


Problem Statement
Cryptocurrency prices are highly volatile and influenced by various unpredictable factors such as market sentiment, economic policies, and global events.
For traders and investors, being able to predict short-term price movements can help in making informed decisions and reducing risk exposure.
The goal of this project is to:
Build a data-driven system that predicts the next closing price of Bitcoin.
Use historical time-series data stored in a SQL database.
Automate the end-to-end pipeline: fetching data, preprocessing, training an LSTM model, and making predictions.
🛠 Tech Stack
Python
TensorFlow / Keras
Pandas, NumPy, Scikit-learn
SQLAlchemy (Database Connection)
MinMaxScaler (Feature Scaling)
ModelCheckpoint (Best Model Saving)
Custom Logging (Error Tracking)
🔍 Workflow with Code Snippets
1️⃣ Fetching Data from SQL
We connect to the database using SQLAlchemy and fetch Bitcoin price history.
# data_fetcher.py
query = text("""
SELECT timestamp, current_price
FROM CryptoMarketData
WHERE coin_id = :coin_name
ORDER BY timestamp ASC
""")
result = self.session.execute(query, {'coin_name': coin_name})
df = pd.DataFrame(result.fetchall(), columns=["timestamp", "current_price"])
df['timestamp'] = pd.to_datetime(df['timestamp'])
💡 Explanation:
This query retrieves timestamp and current_price for the given coin from the CryptoMarketData table, ordered chronologically.
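To make the query concrete, here is a minimal, self-contained sketch that runs the same parameterized SELECT against an in-memory SQLite database. It uses Python's built-in sqlite3 module as a stand-in for the SQLAlchemy session, and the rows and the fetch_coin_rows helper are invented for the demo; only the table and column names come from the snippet above.

```python
import sqlite3

# Same query text as the snippet; sqlite3 also accepts :name placeholders.
QUERY = """
SELECT timestamp, current_price
FROM CryptoMarketData
WHERE coin_id = :coin_name
ORDER BY timestamp ASC
"""

def fetch_coin_rows(conn, coin_name):
    """Hypothetical helper: return (timestamp, price) rows for one coin, oldest first."""
    return conn.execute(QUERY, {"coin_name": coin_name}).fetchall()

# Demo data (invented): rows inserted out of order, plus one row for another coin.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CryptoMarketData (coin_id TEXT, timestamp TEXT, current_price REAL)"
)
conn.executemany(
    "INSERT INTO CryptoMarketData VALUES (?, ?, ?)",
    [
        ("bitcoin", "2025-01-01 00:02:00", 101.0),
        ("ethereum", "2025-01-01 00:00:00", 5.0),
        ("bitcoin", "2025-01-01 00:00:00", 100.0),
        ("bitcoin", "2025-01-01 00:01:00", 100.5),
    ],
)

rows = fetch_coin_rows(conn, "bitcoin")
print(rows)
```

Binding coin_name as a parameter rather than formatting it into the SQL string keeps the query safe from injection, and the ORDER BY guarantees the chronological order the later windowing step depends on.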
2️⃣ Data Preprocessing for LSTM
Before feeding the data into the model, we scale and sequence it.
# data_preprocessor.py
scaled = self.scaler.fit_transform(df[['current_price']])
# Create sequences of length `time_steps`
X, y = [], []
for i in range(self.time_steps, len(scaled)):
    X.append(scaled[i - self.time_steps:i])  # the previous `time_steps` prices
    y.append(scaled[i])                      # the price that follows them
💡 Explanation:
Scaling keeps all values between 0 and 1 for stable training.
Sequences are chunks of historical prices used to predict the next price.
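The sketch below shows both steps on a tiny invented price list so the resulting shapes are easy to check. It scales with plain NumPy using the same min-max formula that MinMaxScaler applies for its default 0 to 1 range; the function names are hypothetical, not the project's actual DataPreprocessor methods.

```python
import numpy as np

def min_max_scale(values):
    """Scale a 1-D price series into [0, 1]; returns a column of shape (n, 1).

    Assumes the prices are not all identical (otherwise max - min is zero).
    """
    values = np.asarray(values, dtype=float).reshape(-1, 1)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)

def create_sequences(scaled, time_steps):
    """Slide a window of `time_steps` rows over `scaled`; the target is the next row."""
    X, y = [], []
    for i in range(time_steps, len(scaled)):
        X.append(scaled[i - time_steps:i])
        y.append(scaled[i])
    return np.array(X), np.array(y)

prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]  # invented prices
scaled = min_max_scale(prices)
X, y = create_sequences(scaled, time_steps=3)
print(X.shape, y.shape)  # (3, 3, 1) (3, 1)
```

Six prices with a window of three give three training samples, each shaped (time_steps, features), which is the 3-D layout an LSTM layer expects.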
3️⃣ Building the LSTM Model
Our model stacks two LSTM layers, each followed by dropout for regularization.
# model_builder.py
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=self.input_shape))
model.add(Dropout(0.2))
model.add(LSTM(64))
model.add(Dropout(0.2))
model.add(Dense(1))
💡 Explanation:
First LSTM layer outputs sequences to feed into the second LSTM layer.
Dropout reduces overfitting by randomly zeroing a fraction of the activations during training.
Dense(1) outputs a single predicted price value.
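To get a feel for how big this network is, the following sketch counts its trainable parameters by hand. It uses the standard formula for an LSTM layer with bias (four gates, each with input weights, recurrent weights, and a bias vector) and assumes a single input feature, the scaled price; the helper names are made up for illustration.

```python
def lstm_params(input_dim, units):
    """Trainable weights in one LSTM layer: 4 gates, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units

n_features = 1  # assumption: one input feature (the scaled price)

first = lstm_params(n_features, 64)   # LSTM(64, return_sequences=True)
second = lstm_params(64, 64)          # LSTM(64), fed by the first layer's 64 outputs
head = dense_params(64, 1)            # Dense(1)

# Dropout layers add no trainable parameters.
print(first, second, head, first + second + head)  # 16896 33024 65 49985
```

Roughly 50,000 parameters is small by deep-learning standards, which is consistent with the goal of a model that trains quickly.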
4️⃣ Training the Model
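ModelCheckpoint saves the best version of the LSTM model during training: it monitors a metric and overwrites the saved file only when that metric improves. The best-scoring weights are then ready for prediction without retraining. Note that with monitor='loss' the "best" model is the one with the lowest training loss, which usually keeps falling even when the model starts to overfit; monitoring val_loss instead is what protects against keeping an overfit model from later epochs.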
# trainer.py
callbacks = [
    ModelCheckpoint(filepath=model_path, save_best_only=True, monitor='loss', verbose=1)
]
history = self.model.fit(
    X_train, y_train,
    epochs=self.epochs,
    batch_size=self.batch_size,
    validation_data=(X_val, y_val),
    callbacks=callbacks
)
filepath=model_path → Tells Keras where to save the .h5 model file.
save_best_only=True → Avoids overwriting with worse versions.
monitor='loss' → Saves the model when the training loss decreases (you could also monitor validation loss).
verbose=1 → Prints a message whenever the model is saved.
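The choice of monitored metric matters more than it may seem. The sketch below is a simplified simulation of the save-best-only rule (save whenever the monitored value beats the best seen so far), not Keras's actual implementation, and the loss curves are invented to mimic a model that begins to overfit after the third epoch.

```python
def epochs_that_save(monitored_values):
    """Return the 1-based epochs at which a save-best-only rule would write a file.

    Simplified stand-in: save whenever the monitored value is strictly
    lower than every value seen so far.
    """
    best = float("inf")
    saved = []
    for epoch, value in enumerate(monitored_values, start=1):
        if value < best:
            best = value
            saved.append(epoch)
    return saved

# Invented curves: training loss keeps falling, validation loss turns up after epoch 3.
train_loss = [0.90, 0.70, 0.50, 0.40, 0.30]
val_loss = [1.00, 0.80, 0.75, 0.85, 0.90]

print(epochs_that_save(train_loss))  # [1, 2, 3, 4, 5]
print(epochs_that_save(val_loss))    # [1, 2, 3]
```

Monitoring the training loss ends up keeping the epoch-5 weights, while monitoring the validation loss keeps the epoch-3 weights, the last point before the held-out error started rising.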
5️⃣ Making Predictions
We load the trained model and predict the next price.
# predictor.py
last_sequence = scaled_data[-Config.TIME_STEPS:]
input_data = np.expand_dims(last_sequence, axis=0)
predicted_scaled = self.model.predict(input_data)
predicted_price = self.preprocessor.inverse_scale(predicted_scaled)[0][0]
💡 Explanation:
We take the last TIME_STEPS prices, scale them, and feed them into the model.
The output is inverse-scaled back to the actual price.
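Here is a small sketch of the shape handling and the inverse scaling, runnable without a trained model. The fake_predict function is a hypothetical stand-in for model.predict that simply repeats the last value of the window, and the prices and the window length of 3 are invented for the demo.

```python
import numpy as np

TIME_STEPS = 3  # small window for the demo; stands in for Config.TIME_STEPS

prices = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 108.0])  # invented prices
lo, hi = prices.min(), prices.max()
scaled_data = ((prices - lo) / (hi - lo)).reshape(-1, 1)  # min-max scaling to [0, 1]

def fake_predict(batch):
    """Hypothetical stand-in for model.predict: repeats the last value in each window."""
    return batch[:, -1, :]

last_sequence = scaled_data[-TIME_STEPS:]            # shape (TIME_STEPS, 1)
input_data = np.expand_dims(last_sequence, axis=0)   # shape (1, TIME_STEPS, 1)
predicted_scaled = fake_predict(input_data)          # shape (1, 1)

# Inverse of min-max scaling: map the [0, 1] value back to a price.
predicted_price = (predicted_scaled * (hi - lo) + lo)[0][0]
print(input_data.shape, predicted_price)
```

The expand_dims call adds the batch dimension, so a single window is presented as a batch of one. The inverse transform must reuse the same minimum and maximum that were used for scaling, which is why the real pipeline keeps the fitted scaler around.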
6️⃣ End-to-End Pipeline
We connect all the components into one smooth execution.
# pipeline.py
fetcher = CryptoDataFetcher()
df = fetcher.fetch_coin_data(coin_name)
preprocessor = DataPreprocessor()
scaled_data = preprocessor.scale_data(df)
X, y = preprocessor.create_sequences(scaled_data)
X_train, y_train, X_test, y_test = preprocessor.train_test_split(X, y)
builder = LSTMModelBuilder(input_shape=(X_train.shape[1], X_train.shape[2]))
model = builder.build_model()
trainer = ModelTrainer(model, coin_name)
trainer.train(X_train, y_train, X_val=X_test, y_val=y_test)
predictor = CryptoPricePredictor(coin_name)
print(f"✅ Predicted Next Price: ₹{predictor.predict_next_price(df):.2f}")
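The pipeline does not show what preprocessor.train_test_split does internally. For time-series data, a reasonable assumption is a chronological split with no shuffling, so the model is always evaluated on prices that come after the ones it trained on. The sketch below illustrates that idea; the function name and the 80/20 ratio are assumptions, not the project's actual settings.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Split windows in time order: earliest part for training, the rest held out.

    Returns values in the same order as the pipeline call above:
    X_train, y_train, X_test, y_test.
    """
    split = int(len(X) * train_fraction)
    return X[:split], y[:split], X[split:], y[split:]

# Invented data: 10 windows of 3 time steps with 1 feature each.
X = np.arange(30, dtype=float).reshape(10, 3, 1)
y = np.arange(10, dtype=float).reshape(10, 1)

X_train, y_train, X_test, y_test = chronological_split(X, y)
print(X_train.shape, X_test.shape)  # (8, 3, 1) (2, 3, 1)
```

Shuffling before splitting would let windows from the "future" leak into training, which tends to make validation scores look better than the model really is.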
How Data Flows Through My Bitcoin Price Prediction System
Inside the Model: Architecture and Design Choices
Why the Model is Built This Way
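We use 60 timesteps so the model sees the 60 most recent prices before each prediction. With minute-level data that is roughly a one-hour "memory" (it would be about two months only if each row were a daily price). That's enough to catch meaningful short-term trends without drowning the model in too much history or slowing training.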
Each LSTM layer has 64 units — a good middle ground where the model is smart enough to learn complex patterns but still trains quickly.
We add a 20% dropout so the model doesn’t get too “attached” to specific neurons. It’s like making it work with slightly different teammates each round, which helps it generalize better.
The final Dense layer has 1 neuron because we just want one thing: the next predicted price.
For training, Mean Squared Error (MSE) is a natural fit: it squares each error, so one big mistake costs far more than several small ones, which pushes the model away from large misses.
And we use the Adam optimizer because it learns fast, adapts well to different data patterns, and works great for messy, unpredictable crypto price data.
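To see what "punishing big mistakes more" means in numbers, the sketch below compares MSE with Mean Absolute Error (MAE) on two invented sets of predictions. MAE is not used in this project; it appears here only as a point of comparison.

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

actual = [100.0, 100.0, 100.0, 100.0]
four_small_misses = [102.0, 98.0, 102.0, 98.0]  # every prediction off by 2
one_big_miss = [100.0, 100.0, 100.0, 108.0]     # one prediction off by 8

print(mae(actual, four_small_misses), mae(actual, one_big_miss))  # 2.0 2.0
print(mse(actual, four_small_misses), mse(actual, one_big_miss))  # 4.0 16.0
```

Both sets of predictions have the same total absolute error, so MAE cannot tell them apart, while MSE rates the single large miss four times worse.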
Results
The LSTM model was trained to predict the next minute’s price for a selected cryptocurrency. Below is an example where Bitcoin’s predicted price is ₹113,951.23.
Key Takeaways
Built a modular end-to-end LSTM pipeline for minute-level price prediction.
Used SQL as the data source, ensuring the model trains on real historical data.
Created separate models for each coin so predictions remain coin-specific.
Integrated an interactive dashboard for quick, user-friendly predictions.
Future Enhancements
Current implementation focuses on learning LSTM concepts with minute-level predictions.
The pipeline is flexible for future enhancements.
Integrate MLflow to:
Track experiments
Manage model versions
Streamline retraining as new data is added
Expand the model to per-hour predictions for more strategic and less noisy forecasts.
Implement a multi-coin single model to avoid retraining for every cryptocurrency.
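For the per-hour idea, one possible first step is to downsample the minute-level rows to one price per hour before building sequences. The sketch below keeps the last price seen in each clock hour using only the standard library; the helper name and the sample rows are invented, and other aggregations (such as an hourly mean) would be equally valid choices.

```python
from datetime import datetime

def to_hourly_close(rows):
    """Keep the last price seen in each clock hour.

    `rows` is a list of (datetime, price) pairs sorted oldest first.
    """
    hourly = {}
    for ts, price in rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        hourly[hour] = price  # later rows in the same hour overwrite earlier ones
    return sorted(hourly.items())

rows = [  # invented minute-level samples
    (datetime(2025, 1, 1, 9, 58), 100.0),
    (datetime(2025, 1, 1, 9, 59), 101.0),
    (datetime(2025, 1, 1, 10, 0), 102.0),
    (datetime(2025, 1, 1, 10, 1), 103.0),
]

hourly = to_hourly_close(rows)
print(hourly)
```

The rest of the pipeline could stay unchanged, since the scaling and windowing steps only care about an ordered series of prices, not the spacing between them.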
End Note
This project started as a mini-experiment to understand how LSTM models handle time-series data — and quickly became a fully functional next-minute crypto price predictor. By combining real-time market data from CoinGecko, a clean preprocessing pipeline, and a modular LSTM architecture, we built something both educational and practical.
The real win here isn’t just the predictions — it’s the foundation we’ve created. With planned enhancements like MLflow integration, per-hour forecasting, and real-time deployment, this project can easily grow into a more robust and production-ready system.
For now, it stands as a hands-on learning milestone and a great reminder that even small projects can teach big lessons.
Here is the link:
Written by Nilanjan Sarkar
