Using MindsDB to predict the risk of heart disease
This is a guest post from MindsDB community written by Mohd Talha
Cardiovascular disease remains the leading cause of morbidity and mortality according to the National Center for Health Statistics in the United States, and consequently, early diagnosis is of paramount importance. Machine learning technology, a subfield of artificial intelligence, is enabling scientists, clinicians and patients to detect it in the earlier stages and therefore save lives.
Until now, building, deploying and maintaining applied machine learning solutions was a complicated and expensive task, because it required skilled personnel and expensive tools. But not only that. A traditional machine learning project requires building integrations with data and applications, that is not only a technical but also an organizational challenge. So what if machine learning can become a part of the standard tools that are already in use?
This is exactly the problem that MindsDB is solving. It makes machine learning easy to use by automating and integrating it into the mainstream instruments for data and analytics, namely databases and business intelligence software. It adds an AI “brain” to databases so that they can learn automatically from existing data, allowing you to generate and visualize predictions using standard data query language like SQL. Lastly, MindsDB is open-source, and anyone can use it for free.
In this article, we will show step by step how to use MindsDB inside databases to predict the risk of heart disease for patients.
You can follow this tutorial by connecting to your own database and using different data - the same workflow applies to most machine learning use cases. Let’s get started!
Pre-requisites
If you want to install MindsDB locally, check out the installation guide for Docker or PyPi and you can follow this tutorial.
If you are OK with using MindsDB cloud, then simply create a free account and you will be up and running in just one minute.
Second, you will need to have a MySQL client like DBeaver, MySQL Workbench, etc. installed locally to connect to the MindsDB MySQL API.
Connect to your data
First, we need to connect MindsDB to the database where the Heart Disease data is stored. If you use MindsDB Studio, then in the left navigation bar click on Datasources. Next, click on the MySQL DATABASE button. Then provide all of the required parameters for connecting to the database and Run the Query.
You can also use any MySQL client to connect to MindsDB’s MySQL API to connect database and train a new model that shall predict the risk of heart disease for a certain patient. Please refer to MindsDB docs.
How to use MindsDB
MindsDB allows you to automatically create & train machine learning models from the data in your database that you have connected to in the previous step. MindsDB works via MySQL Wire protocol, which means you can do all these steps through SQL commands. When it comes to making predictions, SQL queries become even handier because you can easily make them straight from your existing applications or Business Intelligence tools that already speak SQL. The ML models are available to use immediately after being trained as if they were virtual database tables (a concept called “AI Tables”).
So, let’s see how it works.
The first step is to connect to MindsDB’s MySQL API. Go to your MySQL client and execute:
mysql -h cloud.mindsdb.com --port 3306 -u theusername@mail.com -p
In the above command, we specify the hostname and user name, as well as a password for connecting. If you use a local instance of MindsDB you have to specify its parameters. Please refer to the documentation.
If you have an authentication error, please make sure you are providing the email address you have used to create an account on MindsDB Cloud.
Data Overview
For the example of this tutorial, we will use the heart disease dataset available publicly in Kaggle. Each row represents a patient and we will train a machine learning model to help us predict if the patient is classified as a heart disease patient. Below is a short description of each feature inside the data.
age - In Years
sex - 1 = Male; 0 = Female
cp - chest pain type (4 values)
trestbps - Resting blood pressure (in mm Hg on admission to the hospital)
chol - Serum cholesterol in mg/dl
fbs - Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
restecg - Resting electrocardiographic results
thalach - Maximum heart rate achieved
exang - Exercise induced angina (1 = yes; 0 = no)
oldpeak - ST depression induced by exercise relative to rest
slope - the slope of the peak exercise ST segment
ca - Number of major vessels (0-3) colored by fluoroscopy
thal - 3 = normal; 6 = fixed defect; 7 = reversible1 defect
target - 1 or 0 (This is what we will predict)
Using SQL Statements to train ML models automatically
Now, we will train a new machine learning model from the datasource we have created.
Go to your mysql-client and run:
use mindsdb;
show tables;
You will notice there are two tables available inside the MindsDB database. To train a new machine learning model, we will need to CREATE Predictor as a new record inside the predictors table as:
CREATE PREDICTOR predictor_name
FROM integration_name
(SELECT column_name, column_name2 FROM table_name)
PREDICT column_name AS column_alias;
The required values that we need to provide are:
predictor_name (string) - The name of the model
integration_name (string) - The name of the connection to your database.
column_name (string) - The feature you want to predict.
To train the model that will predict the risk of heart disease as target run:
CREATE PREDICTOR patients_target FROM hdidemo (SELECT * FROM HeartDiseaseData)
PREDICT target USING {"ignore_columns": ["sex"]};
What we did here was to create a predictor called patients_target
to predict the presence of heart disease as target
and also ignore the sex
column as an irrelevant column for the model. The model has started training. To check if the training has finished, you can SELECT the model name from the predictors table:
SELECT * FROM mindsdb.predictors WHERE name=’patients_target’;
The complete status means that the model training has successfully finished.
Using SQL Statements to make predictions
The following steps would be to query the model and predict the heart disease risk. Let’s imagine a patient. This patient’s age is 30, she has a cholesterol level of 177 mg/dl, with a slope of the peak exercise ST segment as 2, and thal as 2. Add all of this information to the WHERE
clause.
SELECT target as prediction, target_confidence as confidence, target_explain as info FROM mindsdb.patients_target WHERE when_data='{"age": 30, "chol": 177, "slope": 2, "thal": 2}';
With a confidence of around 99%, MindsDB predicted a high risk of heart disease for this patient.
The above example shows how you can make predictions for a single patient. But what if you have a table in your database with many patients’ diagnosis data, and you want to make predictions for them in bulk?
For this purpose, you can join the predictor with such a table.
SELECT * FROM db_integration.HeartDiseaseData AS t JOIN mindsdb.patients_target AS tb WHERE t.age in ('45');
Now you can connect the output table to your BI tool and for more convenient visualization of the results using graphs or pivots.
Conclusion
In this tutorial, you have seen how easy it is to apply machine learning for your predictive needs. MindsDB's innovative open-source technology is making it easy to leverage machine learning for people who are not experts in this field. However, MindsDB is a great tool for ML practitioners as well: if you are a skilled data scientist, you could also benefit from the convenience of deploying custom machine learning solutions within databases by building & configuring models manually through a declarative syntax called JSON-AI.
There are other interesting ML use cases where MindsDB is positioned exceptionally well, like multivariate time-series and real-time data streams, so feel free to try MindsDB yourself! Simply sign up for a free cloud account and you will be up and running in 5 minutes. And if you need any help with MindsDB, feel free to ask Slack or Github community.
If this article was helpful, please give us a GitHub star here.
Subscribe to my newsletter
Read articles from MindsDB directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
MindsDB
MindsDB
MindsDB enables developers to create the next wave of AI-centered applications that will transform the way we live and work. MindsDB was founded in 2017 by Adam Carrigan (COO) and Jorge Torres (CEO) and is backed by Benchmark, Mayfield, Nvidia's NVentures, YCombinator and others. MindsDB is also recognized by Forbes as one of America's most promising AI companies (2021) and by Gartner as a Cool Vendor for Data and AI (2022). To see how MindsDB can help you visit www.mindsdb.com or follow us @MindsDB.