ML Primer

Machine Learning Lifecycle:
Understand the business problem
Capture and process the data
Explore and analyse the data
Choose a machine learning algorithm
Build the model and evaluate its accuracy
Deploy the model
There are many types of machine learning, and we can group them by the problem they solve.
Types of ML Techniques
Supervised Learning: learns from data that has already been labelled. It is task driven: the goal is to make a prediction. Use it when the labels are known and you need a precise outcome, such as a specific value or category returned.
Types of Supervised Learning Models
Regression: the process of finding a function that maps a dataset to a continuous variable (a number).
e.g.: What will the temperature be next week? Market forecasting.
Classification: the process of finding a function that divides a dataset into classes (categories).
e.g.: Will it be cold or hot tomorrow?
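To make the difference in output type concrete, here is a minimal sketch in plain Python. The temperature readings, the 20-degree threshold, and the helper names are all made up for illustration. The "classifier" simply thresholds the regression output, which is an illustration choice and not one of the standard classification algorithms.

```python
# Hypothetical toy data: day number -> temperature in degrees Celsius.
days = [1, 2, 3, 4, 5]
temps = [15.0, 17.0, 19.0, 21.0, 23.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_temperature(day, slope, intercept):
    """Regression: the output is a continuous number."""
    return slope * day + intercept


def predict_label(day, slope, intercept, threshold=20.0):
    """Classification: the output is one of a fixed set of categories."""
    temperature = predict_temperature(day, slope, intercept)
    return "hot" if temperature >= threshold else "cold"


slope, intercept = fit_line(days, temps)
print(predict_temperature(6, slope, intercept))  # a number
print(predict_label(6, slope, intercept))        # a category
```

The same input (day 6) produces a number from the regression function and a label from the classification function.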
Algorithms for Supervised Learning (Regression)
Simple Linear Regression
Multiple Linear Regression
Polynomial Linear Regression
Support Vector Regression
Decision Tree Regression
Random Forest Regression
Algorithms for Supervised Learning (Classification)
Logistic Regression
K-Nearest Neighbors (KNN)
Support Vector Machines (SVM)
Kernel SVM
Naive Bayes
Decision Tree Classification
Random Forest Classification
Unsupervised Learning: a machine learning task that needs no labelled training data. It takes unlabelled data, discovers the patterns in it, and applies its own labels.
It’s like saying: “I’m an independent worker, I can figure this out on my own.”
Types of Unsupervised Learning Models:
Clustering: groups unlabelled data based on similarities and differences.
e.g.: This group of customers may buy a Mac. Targeted marketing.
Association: finds relationships between variables, such as items that frequently appear together.
e.g.: Most frequently bought items: if someone buys bread, suggest butter. Customer recommendations.
Dimensionality Reduction: the process of reducing the number of features (or dimensions) in a dataset while retaining as much information as possible. Often used as a preprocessing stage.
e.g.: Big data visualization.
Algorithms for Unsupervised Learning (Clustering)
K-means
DBSCAN
K-Modes
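As a rough idea of how K-means works, the sketch below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. The points and the starting centroids are invented for illustration, and real implementations usually choose the starting centroids more carefully.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

No labels were supplied. The cluster numbers in `labels` are the algorithm's own labels for the groups it found.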
Algorithms for Unsupervised Learning (Association)
Apriori
Eclat
FP-Growth
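These algorithms search for itemsets that occur together often, and rules are usually judged with measures such as support and confidence. The sketch below only computes those two measures for the bread-and-butter rule. It is not an implementation of Apriori, Eclat, or FP-Growth, and the shopping baskets are made up.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions with the antecedent, the share that also has the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
```

Here three of the five baskets contain both bread and butter, and three of the four baskets with bread also contain butter, which is what makes "suggest butter" a reasonable recommendation.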
Algorithms for Unsupervised Learning (Dimensionality Reduction)
Principal component analysis (PCA)
Singular value decomposition (SVD)
Linear discriminant analysis (LDA). Note that LDA uses class labels, so strictly speaking it is a supervised technique.
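To show what "reducing dimensions" means in practice, here is a small sketch of the idea behind PCA for 2-D data: find the direction of greatest variance and describe each point by a single number along that direction. It uses power iteration on the covariance matrix, which is one possible way to find that direction and an assumption of this sketch. The data points are made up, and the starting vector (1, 0) would not work for every dataset.

```python
import math


def first_principal_component(points, iterations=50):
    """Top eigenvector of the 2x2 covariance matrix, found by power iteration."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    syy = sum(y * y for _, y in centered) / n
    vx, vy = 1.0, 0.0  # assumed starting direction
    for _ in range(iterations):
        vx, vy = sxx * vx + sxy * vy, sxy * vx + syy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centered


# Hypothetical points that lie exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, centered = first_principal_component(points)

# Project each 2-D point onto the component: one number per point.
reduced = [x * component[0] + y * component[1] for x, y in centered]
print(component)
print(reduced)
```

Each point started with two features and ends with one, and because these points lie on a straight line, no information is lost in the projection.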
Note: Supervised learning tends to be more accurate than unsupervised learning but requires more upfront work to label the data. Unsupervised learning needs no labels, but it still requires human intervention to validate the results.
How to choose the right algorithm?
Factors in Choosing the Correct Algorithm
The kind of problem the model must solve
The available data (size of the training set)
The accuracy required of the model
Time taken to train the model (training time)
Number of parameters
Number of features
Linearity
Key Question 1: What type of problem do you need to solve?
Key Question 2: What type of data do you have?
Key Question 3: What level of interpretability do you need?
Key Question 4: What volume of data do you handle?
Understand Your Problem: Begin by gaining a deep understanding of the problem you are trying to solve. What is your goal? Is the problem about classification, regression, clustering, or something else? What kind of data are you working with?
Process the Data: Ensure that your data is in the right format for your chosen algorithm. Clean and prepare your data before applying techniques such as clustering or regression.
Exploration of Data: Conduct data analysis to gain insights into your data. Visualizations and statistics help you understand the relationships within your data.
Metrics Evaluation: Decide on the metrics that will measure the success of the model. Choose a metric that aligns with your problem.
Simple Models: Begin with simple, easy-to-understand algorithms. For classification, try logistic regression or a decision tree. A simple model provides a baseline for comparison.
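A baseline can be even simpler than a trained algorithm. The sketch below always predicts the most common label seen in training and measures accuracy on a test set. The labels are invented, and the function names are hypothetical. Any real model should beat this number to be worth its extra complexity.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most common label in the training data."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical labels for a "cold or hot tomorrow" problem.
train_labels = ["cold", "cold", "hot", "cold", "hot", "cold"]
test_labels = ["cold", "hot", "cold", "cold"]

majority = fit_majority_baseline(train_labels)
baseline_predictions = [majority] * len(test_labels)
print(majority, accuracy(baseline_predictions, test_labels))
```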
Reinforcement Learning: an agent operates in an environment and must learn how to act from feedback. There is no existing dataset. Instead, there is an environment, and the ML model generates its own data by making many attempts to reach a goal. It is decision driven. e.g.: Game AI, learning tasks, robot navigation.
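As an illustration of learning from feedback, here is a small sketch of tabular Q-learning, one common reinforcement learning method, chosen here only as an example. The environment is a made-up corridor of five cells where the agent earns a reward of 1 for reaching the last cell. The learning rate, discount factor, episode count, and purely random exploration are all assumptions of this sketch.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9


def step(state, action):
    """Environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, max_steps=100, seed=0):
    """Learn action values purely from the feedback of many attempts."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(len(ACTIONS))  # explore with random actions
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)
```

The agent is never told which direction is correct. It generates its own experience and the learned values end up favouring the moves that lead toward the goal.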
Instance-Based Learning vs Model-Based Learning
After machine learning (ML) models are trained on enough data, they are ready to make predictions on unseen data. Some ML models predict the target by comparing the unseen data to the data they have already seen. Other ML models derive a mathematical function that lets them make general predictions without comparing the new data to the training data. We call these two broad kinds of ML models instance-based and model-based learning, respectively.
Instance-based learning: For example, in a medical diagnosis application, k-NN could be used to predict a patient's condition based on the stored records of similar patients' symptoms and diagnoses.
Model-based learning: For example, in stock market prediction, linear regression could be used to model the relationship between past stock prices and various economic indicators to predict future stock prices.
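The contrast can be sketched with a few lines of plain Python. The k-NN function keeps the whole training set and looks up neighbours at prediction time, while the linear model compresses the training set into a slope and an intercept. The data points (which lie on y = 2x) and the function names are invented for illustration.

```python
# Hypothetical training data: (feature, target) pairs on the line y = 2x.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def knn_predict(x, data, k=2):
    """Instance-based: keep the data and average the k nearest targets."""
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def fit_linear(data):
    """Model-based: summarise the data as (slope, intercept) by least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


slope, intercept = fit_linear(train)

for x in (2.4, 10.0):
    # k-NN needs the training data at prediction time; the linear model does not.
    print(x, knn_predict(x, train), slope * x + intercept)
```

Inside the range of the training data the two predictions are close. For x = 10.0, k-NN can only average the two nearest stored targets, while the fitted line extends the learned relationship beyond the data it saw.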
Steps to build the model
1. Problem Definition
2. Collect Data
3. Clean Your Data
4. Explore Your Data
5. Split Your Data
6. Choose a Model
7. Train Your Model
8. Evaluate Your Model
9. Improve Your Model
10. Deploy Your Model
11. Model Monitoring
12. Model Improvement
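Steps 5 to 8 above can be sketched end to end in plain Python. The dataset, the 25% test fraction, the seed, and the deliberately simple threshold model are all assumptions made for illustration. A real project would normally use a library for these steps.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Step 5: shuffle a copy of the rows and hold out a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold_model(rows):
    """Steps 6-7: a very simple model, the midpoint between the class means."""
    cold = [t for t, label in rows if label == "cold"]
    hot = [t for t, label in rows if label == "hot"]
    return (sum(cold) / len(cold) + sum(hot) / len(hot)) / 2


def evaluate(threshold, rows):
    """Step 8: accuracy on rows the model has not seen."""
    correct = sum(
        1 for t, label in rows if ("hot" if t >= threshold else "cold") == label
    )
    return correct / len(rows)


# Hypothetical labelled data: (temperature, label) pairs.
rows = [(float(t), "cold") for t in range(5, 15)]
rows += [(float(t), "hot") for t in range(25, 35)]

train_rows, test_rows = train_test_split(rows)
threshold = train_threshold_model(train_rows)
print(len(train_rows), len(test_rows), evaluate(threshold, test_rows))
```

The model is trained only on `train_rows` and scored only on `test_rows`, which is the point of splitting the data before choosing and training a model.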
Written by
Ashok Vanga
Golang Developer and Blockchain certified professional