How to Start a Machine Learning Problem
1. Project Scoping
You should know what kind of project you are working on, for example whether it is a supervised or an unsupervised machine learning problem. You should also know what machine learning can and cannot solve. For example, if your data is noisy, incomplete, or imbalanced, even the most robust algorithm will produce errors.
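A quick data quality check can inform the scoping decision. The snippet below is a minimal sketch in plain Python; the toy dataset and the helper names missing_fraction and class_balance are made up for illustration and are not from any library.

```python
from collections import Counter

# Hypothetical toy dataset: each row is (feature_value, label).
# None marks a missing feature value.
rows = [
    (0.5, "spam"), (1.2, "ham"), (None, "ham"), (0.7, "ham"),
    (2.1, "ham"), (0.9, "ham"), (None, "ham"), (1.8, "ham"),
]

def missing_fraction(rows):
    """Fraction of rows whose feature value is missing."""
    return sum(1 for value, _ in rows if value is None) / len(rows)

def class_balance(rows):
    """Share of each label in the dataset."""
    counts = Counter(label for _, label in rows)
    return {label: count / len(rows) for label, count in counts.items()}

print("missing:", missing_fraction(rows))
print("balance:", class_balance(rows))
```

If a large share of values is missing, or one class dominates the labels, it is worth addressing that before choosing an algorithm.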
2. Data Management
Data generated for ML systems can be large and diverse, so it requires scalable infrastructure. Database queries are one way to manage it. SQL is a database query language that helps with data management and preprocessing. It can extract the specific subset of information from your dataset that your machine learning model needs. A beginner does not need to learn SQL to start solving machine learning problems, but a basic understanding of ML theory and Python is necessary.
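As an illustration of extracting a subset with SQL, here is a small sketch using Python's built-in sqlite3 module. The customers table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with a hypothetical "customers" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, age INTEGER, country TEXT, churned INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, 34, "IN", 0), (2, 51, "US", 1), (3, 27, "IN", 1), (4, 45, "DE", 0)],
)

# Pull only the columns and rows the model needs.
query = "SELECT age, churned FROM customers WHERE country = ? ORDER BY id"
training_rows = conn.execute(query, ("IN",)).fetchall()
conn.close()

print(training_rows)  # [(34, 0), (27, 1)]
```

The query keeps two of the four columns and filters the rows by country, so only the relevant slice of the data reaches the model.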
3. ML Model Development
Once the data has been extracted and preprocessed, it is time to analyze it with the help of visualization and statistics. Bar plots, scatter plots, and pie charts are some of the ways to understand the structure of the data before diving deeper. This step is called exploratory data analysis (EDA): exploring the data before applying machine learning algorithms.
Statistics and probability are important for understanding the data because they reveal patterns that can be used to make better predictions.
After exploring the data, we apply machine learning algorithms suited to the problem, whether it is supervised (such as classification or regression) or unsupervised.
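A few summary statistics already say a lot about a column. The sketch below uses Python's built-in statistics module on a made-up list of ages; the two-standard-deviation rule for flagging outliers is just one common rule of thumb, chosen here as an assumption.

```python
import statistics

# Hypothetical column of ages; 95 is a deliberate outlier.
ages = [22, 25, 25, 31, 38, 41, 95]

summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation
}

# Flag values more than two standard deviations away from the mean.
outliers = [
    age for age in ages
    if abs(age - summary["mean"]) > 2 * summary["stdev"]
]

print(summary)
print("outliers:", outliers)
```

Here the mean sits well above the median, which hints at a skewed column even before plotting it.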
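To make the supervised case concrete, here is a minimal sketch of a classifier: a 1-nearest-neighbour model written in plain Python and evaluated on held-out examples. The data points and the function name predict_1nn are invented for illustration; in practice you would more likely reach for a library such as scikit-learn.

```python
import math

def predict_1nn(train, point):
    """Return the label of the training example closest to `point`."""
    _, nearest_label = min(train, key=lambda example: math.dist(example[0], point))
    return nearest_label

# Hypothetical labelled data: ((feature_1, feature_2), label).
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((4.0, 4.2), "B"), ((3.8, 4.0), "B")]
test = [((0.9, 1.1), "A"), ((4.1, 3.9), "B")]

# Evaluate on examples the model has not seen during training.
correct = sum(predict_1nn(train, features) == label for features, label in test)
accuracy = correct / len(test)

print("accuracy:", accuracy)
```

Keeping the test examples separate from the training examples is what makes the accuracy figure meaningful.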
4. Deployment
Once you have created the model, you have to deploy it. Tools such as Docker (to package the model and its dependencies into a container) and Flask (to serve it over HTTP) can help. Basic knowledge of RESTful API principles, and of how to build such APIs with frameworks like Flask or FastAPI, is helpful.
5. Monitoring and Maintenance
Once in production, the model needs to be monitored for performance decay and kept up to date as its environment changes. Understanding the data pipeline that feeds the model helps with this.
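The idea of a prediction endpoint can be sketched with only the Python standard library. This is not Flask or FastAPI code; those frameworks do the same job with less boilerplate. The predict function is a hypothetical stand-in for a trained model, and the /predict path and port 8000 are assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Hypothetical stand-in for a trained model: thresholds the feature sum."""
    return 1 if sum(features) > 1.0 else 0

def handle_predict(body):
    """Turn a JSON request body into an (HTTP status, JSON payload) pair."""
    try:
        features = json.loads(body)["features"]
        prediction = predict(features)
    except (ValueError, KeyError, TypeError):
        error = {"error": "body must be JSON with a 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"prediction": prediction}).encode()

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the (assumed) /predict route is served.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

def run_server(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Calling run_server() starts the service, after which a client can POST a body such as {"features": [0.7, 0.6]} to /predict and receive the prediction as JSON. Keeping handle_predict separate from the HTTP handler makes the request logic easy to test without starting a server.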
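One simple monitoring signal is whether incoming data still looks like the training data. The sketch below compares the mean of a live feature against the training mean, measured in training standard deviations. The function names, the sample values, and the threshold of 2.0 are all assumptions for illustration; production systems typically use richer drift tests.

```python
import statistics

def mean_shift_in_stdevs(reference, live):
    """How far the live mean has moved from the reference mean,
    in units of the reference standard deviation."""
    ref_mean = statistics.mean(reference)
    ref_stdev = statistics.stdev(reference)
    return abs(statistics.mean(live) - ref_mean) / ref_stdev

def needs_attention(reference, live, threshold=2.0):
    """Flag the feature when the shift exceeds the (assumed) threshold."""
    return mean_shift_in_stdevs(reference, live) > threshold

# Hypothetical feature values seen at training time and in production.
training_ages = [30, 32, 35, 31, 33, 34, 29, 36]
live_similar = [31, 33, 34, 32]
live_shifted = [48, 52, 50, 47]

print(needs_attention(training_ages, live_similar))
print(needs_attention(training_ages, live_shifted))
```

A flagged feature does not prove the model has degraded, but it is a cue to check recent predictions and consider retraining.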
6. Business Analysis
Model performance needs to be evaluated against business goals and analyzed to generate business insights. You should have in-depth knowledge of statistics and probability to extract insights from data.
source: https://huyenchip.com/machine-learning-systems-design/toc.html
Written by
Meemansha Priyadarshini
I am a certified TensorFlow Developer and enjoy writing blogs to share my knowledge and assist others.