โค๏ธ Cardiovascular Risk Assessment and Heart Attack Prediction System

Developed by OpenSere (Abhishek Choudhary), this model predicts the risk of heart attacks using structured medical data. It is designed to assist in pre-diagnostic screening and awareness, especially in low-resource environments.
๐ง Model Summary
This model is a gradient boosting classifier trained using the XGBoost algorithm. It uses patient health indicators to predict the probability of heart attack occurrence (binary classification: 0 = No risk, 1 = At risk).
Architecture
Algorithm: XGBoost Classifier
Objective: Binary Classification
Features: 25+ health metrics including Age, Blood Pressure, Cholesterol, Heart Rate, Diabetes, and Lifestyle indicators like Smoking, Obesity, etc.
Labels:
Heart Attack Risk
(0 or 1)
Characteristics
Handles missing/non-critical fields like
Country
,Continent
,Income
, andPatient ID
gracefullyWell-suited for deployment in healthcare dashboards or apps
Lightweight model with fast inference (<50ms)
โ๏ธ Usage
Code Snippet
pythonCopyEditimport joblib
import pandas as pd
# Load the model
model = joblib.load("heart_model_final.pkl")
# Example input data
input_data = pd.DataFrame([{
"Age": 52,
"Sex": 1,
"Cholesterol": 205,
"Blood Pressure": 135,
"Heart Rate": 85,
"Diabetes": 1,
"Family History": 1,
"Smoking": 0,
"Obesity": 1,
"Alcohol Consumption": 0,
"Exercise Hours Per Week": 1,
"Diet": 1,
"Previous Heart Problems": 0,
"Medication Use": 1,
"Stress Level": 3,
"Sedentary Hours Per Day": 8,
"BMI": 28.5,
"Triglycerides": 180,
"Physical Activity Days Per Week": 3,
"Sleep Hours Per Day": 6
}])
# Predict
prediction = model.predict(input_data)[0]
print("Heart Attack Risk:", "Yes" if prediction == 1 else "No")
Input Format
Inputs: Pandas DataFrame with structured numeric data (no missing mandatory features)
Outputs: Binary class label (0: No risk, 1: Risk)
Known Issues
Model assumes data is clean and properly formatted.
False positives may occur in edge cases like high stress but no other symptoms.
๐งฉ System Design
Standalone: Yes, but can be part of a larger diagnostic suite.
Inputs: CSV files, API feeds, or manual form data
Outputs: Class label, probability scores, and explainable SHAP values (optional)
Optional Inputs:
Patient ID
,Country
,Continent
,Income
,Hemisphere
โ๏ธ Implementation Requirements
Training Environment
Language: Python 3.10+
Libraries:
xgboost
,pandas
,scikit-learn
,joblib
Training Time: ~8 seconds on a laptop (i7, 16GB RAM)
Hardware: No GPU required
Inference: <50ms per sample on CPU
๐ Model Characteristics
Initialization
- Trained from scratch using medical data collected from hospital studies.
Stats
Total Features: 20+ core health indicators
Model Size: ~200 KB
XGBoost Trees: 100 estimators, max depth = 5
Latency: Real-time (<0.05s/sample)
Privacy & Optimization
No pruning or quantization
Differential privacy not applied due to anonymized, structured medical data
๐ Data Overview
Training Data
Source: [Custom medical dataset from Iraq hospital and online sources]
Size: ~500 patients
Features engineered from structured CSV
Preprocessing: Categorical encoding, normalization, NA handling
Demographics
Data represents a diverse sample from Middle Eastern populations
Gender-balanced
Adults aged 20โ85
Evaluation Data
80/20 Train/Test split
Validation using stratified 5-fold cross-validation
Metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC
โ Evaluation Results
Summary
Metric | Score |
Accuracy | 94.6% |
Precision | 92.1% |
Recall | 96.3% |
F1 Score | 94.1% |
ROC-AUC | 0.98 |
Subgroup Performance
Better recall for older age groups (โฅ50)
Slightly lower precision for diabetic patients due to overlap in symptoms
Fairness
No explicit bias mitigation applied, but training was stratified
No sensitive personal identifiers in the dataset
โ ๏ธ Usage Limitations
Not a substitute for a certified medical diagnosis
Results may be affected by inaccurate or missing user input
Should be validated against local population data before clinical use
๐ Ethics
Designed to empower awareness, not replace doctors
Dataset was anonymized and used for educational, non-commercial purposes
Users are reminded that clinical intervention must follow expert review
๐งช Research Studies and Evaluation Results
Prediction of Cardiovascular Disease Using XGBoost
This study constructs an XGBoost model for cardiovascular disease prediction, demonstrating its effectiveness compared to other algorithms.
๐ Nature Scientific ReportsEnhanced Heart Attack Prediction Using eXtreme Gradient Boosting
A research paper proposing a novel approach leveraging XGBoost for heart attack analysis and prediction, highlighting its superior performance in handling medical data complexities.
๐ Journal of Theoretical and Practical Engineering ScienceEstimation of Risk Factors Related to Heart Attack With XGBoost
This publication discusses the successful classification of heart attack datasets using XGBoost, emphasizing the model's accuracy in identifying associated risk factors.
๐ ResearchGate Publication
๐ง Additional Resources
Prediction on UCI Heart Disease Dataset Using XGBoost
A project that applies XGBoost to the UCI Heart Disease dataset, providing insights into model implementation and performance.
๐ GitHub RepositoryHow XGBoost Can Save Your Heart with 94% Accuracy
An article detailing the application of XGBoost in heart disease prediction, achieving high accuracy and discussing the model's advantages.
๐ Medium Article
These sources provide comprehensive information on datasets, model architectures, evaluation metrics, and real-world applications relevant to heart attack prediction using machine learning techniques like XGBoost. They can serve as valuable references for the development, validation, and documentation of your model.
Subscribe to my newsletter
Read articles from Abhishek Choudhary directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Abhishek Choudhary
Abhishek Choudhary
Iโm Abhishek Choudhary, a Jaipur-based photographer, developer, and human behavior observer with a passion for building thoughtful digital experiences. With a multidisciplinary background, I bring together technology, creativity, and market insight to develop tools, content, and strategies that solve real problems. Iโve worked across startups, created multiple mini projects, and received national recognition for innovation in both mental health and visual storytelling. ๐ง I'm deeply interested in how humans behave, react, and express โ and I often channel this into my Python-based tools, emotion-detection systems, and AI projects. ๐ธ As a photographer and story-filmmaker, I craft visuals that speak โ combining art, light, and emotion. ๐ On the analytical side, I actively engage in stock market research, tracking trends and behavioral patterns to predict outcomes.