ML Primer

Ashok Vanga

Machine Learning Lifecycle:

Understand the Business problem

Capture and process the data

Explore and analyse the data

Choose a machine learning algorithm

Build the model and evaluate its accuracy

Deploy the model

There are many types of machine learning, and we can group them by the kind of problem they solve.

Types of ML Techniques

  1. Supervised Learning: Task-driven learning from data that has been labelled into categories, used to make predictions. Use it when the labels are known and you want a precise outcome, such as a specific value returned.

    Types of Supervised Learning Models

    Regression: the process of finding a function that maps a dataset to a continuous variable (a number). e.g.: What will the temperature be next week? Market forecasting.

    Classification: the process of finding a function that divides a dataset into classes / categories.

    e.g.: Will it be cold or hot tomorrow?

Algorithms for Supervised Learning (Regression)

Simple Linear Regression

Multiple Linear Regression

Polynomial Linear Regression

Support Vector Regression

Decision Tree Regression

Random Forest Regression
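
To make the regression idea concrete, here is a minimal sketch of simple linear regression fitted with ordinary least squares, using only the Python standard library. The function name `fit_simple_linear_regression` and the temperature data are made up for illustration; in practice you would more likely reach for a library implementation.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: day number vs. temperature in degrees Celsius.
days = [1, 2, 3, 4, 5]
temps = [20.0, 22.0, 24.0, 26.0, 28.0]

slope, intercept = fit_simple_linear_regression(days, temps)
predicted_day_6 = slope * 6 + intercept  # continuous prediction for an unseen day
print(slope, intercept, predicted_day_6)
```

The output is a number rather than a category, which is what separates regression from classification.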

Algorithms for Supervised Learning (Classification)

Logistic Regression

K-Nearest Neighbors (KNN)

Support Vector Machines (SVM)

Kernel SVM

Naive Bayes

Decision Tree Classification

Random Forest Classification
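
As a small illustration of classification, the sketch below implements a bare-bones k-nearest-neighbours classifier for the "cold or hot" example. The feature values, the labels, and the helper name `knn_predict` are assumptions made up for this example, not part of any library.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up features: (temperature in Celsius, humidity in percent).
train_points = [(5, 80), (8, 75), (10, 70), (28, 40), (30, 35), (33, 30)]
train_labels = ["cold", "cold", "cold", "hot", "hot", "hot"]

prediction_cold = knn_predict(train_points, train_labels, (7, 78))
prediction_hot = knn_predict(train_points, train_labels, (31, 33))
print(prediction_cold, prediction_hot)
```

Here the output is one of a fixed set of classes, not a continuous value.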

  2. Unsupervised Learning: A machine learning task that needs no labelled training data. It takes unlabelled data, discovers its patterns, and applies its own labels.

    It is like saying: "I'm an independent worker, I can figure this out on my own."

    Types of Unsupervised Learning Models:

    Clustering: Groups unlabelled data based on similarities and differences.

    e.g.: Targeted marketing (this group of customers may buy a Mac)

    Association: Finds relationships between variables that frequently occur together.

    e.g.: Customer recommendations based on the most frequently bought items: if someone buys bread, suggest butter.

    Dimensionality Reduction: the process of reducing the number of features (or dimensions) in a dataset while retaining as much information as possible. Often used as a preprocessing stage.

    e.g.: Big data visualization

Algorithms for Unsupervised Learning (Clustering)

K-means

DBSCAN

K-Modes
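
The following is a minimal sketch of the K-means idea: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat. The data points, the choice of starting centroids, and the function name `k_means` are assumptions for illustration; real implementations add smarter initialisation and convergence checks.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of K-means iterations."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if no point was assigned to it
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


# Made-up, unlabelled 2D points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the grouping comes purely from distances between points.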

Algorithms for Unsupervised Learning (Association)

Apriori

Eclat

FP-Growth
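
Association algorithms are built around two basic measures, support and confidence. The sketch below computes both for the "bread, then butter" rule on a made-up list of shopping baskets. It is not an implementation of Apriori, Eclat, or FP-Growth; it only shows the quantities those algorithms search over, and the helper names are hypothetical.

```python
# Made-up shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


bread_butter_support = support({"bread", "butter"}, transactions)
bread_to_butter_confidence = confidence({"bread"}, {"butter"}, transactions)
print(bread_butter_support, bread_to_butter_confidence)
```

In this toy data, three of the five baskets contain both items, and three of the four baskets with bread also contain butter, which is why "suggest butter" is a reasonable recommendation.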

Algorithms for Unsupervised Learning (Dimensionality Reduction)

Principal component analysis (PCA)

Singular value decomposition (SVD)

Linear discriminant analysis (LDA) (strictly a supervised method, since it uses class labels)
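
To show what reducing dimensions means in practice, here is a rough sketch that finds the first principal component of a 2D dataset and projects the points onto it, turning two features into one. It uses power iteration on the covariance matrix as a simple stand-in for the full eigendecomposition or SVD a library would use. The data and the function name `first_principal_component` are made up for illustration.

```python
import math


def first_principal_component(points, iterations=100):
    """Return (unit direction of greatest variance, 1D projection of each point)."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    centered = [[p[d] - means[d] for d in range(dims)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and normalise.
    # Assumes the data is not constant, so the norm never becomes zero.
    vector = [1.0] * dims
    for _ in range(iterations):
        vector = [sum(cov[i][j] * vector[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    projected = [sum(row[d] * vector[d] for d in range(dims)) for row in centered]
    return vector, projected


# Made-up 2D data lying close to the line y = x.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
component, projected = first_principal_component(points)
print(component)   # direction of greatest variance
print(projected)   # each point reduced from two features to one
```

Because the two features move together, a single projected value per point keeps most of the information.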

Note: Supervised learning tends to be more accurate than unsupervised learning but requires more upfront work. Unsupervised learning still requires human intervention to validate the results.

How to choose the right algorithm?

Factors in Choosing the Correct Algorithm

The kind of problem being solved

The available data (size of the training set)

The accuracy of the model

Time taken to train the model (training time)

Number of parameters

Number of features

Linearity

Key Question 1: What type of problem do you need to solve?

Key Question 2: What type of data do you have?

Key Question 3: What level of interpretability do you need?

Key Question 4: What volume of data do you handle?

Understand Your Problem: Begin by gaining a deep understanding of the problem you are trying to solve. What is your goal? Is the problem about classification, regression, clustering, or something else? What kind of data are you working with?

Process the Data: Ensure that your data is in the right format for your chosen algorithm. Clean and prepare your data to suit the task at hand (for example, clustering or regression).

Exploration of Data: Conduct data analysis to gain insights into your data. Visualizations and statistics help you to understand the relationships within your data.

Metrics Evaluation: Decide on the metrics that will measure the success of the model. The metric you choose should align with your problem.

Simple models: Begin with simple, easy-to-learn algorithms. For classification, try logistic regression or a decision tree. A simple model provides a baseline for comparison.
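
As one possible illustration of a metric and a baseline, the sketch below measures accuracy for the simplest classifier imaginable: always predict the most common training label. Any real model should beat this number. The labels and the helper names `accuracy` and `majority_class_baseline` are made up for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_class_baseline(train_labels, n_predictions):
    """Predict the most common training label for every example."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return [most_common_label] * n_predictions


# Made-up labels for a "hot or cold" classification problem.
train_labels = ["hot", "hot", "hot", "cold"]
test_labels = ["hot", "cold", "hot", "hot", "cold"]

baseline_predictions = majority_class_baseline(train_labels, len(test_labels))
baseline_accuracy = accuracy(test_labels, baseline_predictions)
print(baseline_accuracy)
```

Accuracy is only one choice of metric; on heavily imbalanced data a baseline like this can score well while being useless, which is why the metric should match the problem.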

  3. Reinforcement Learning: Decision-driven learning in which an agent operates in an environment and must learn how to act using feedback. There is no existing dataset; there is an environment, and the model generates its own data by making many attempts to reach a goal. e.g.: Game AI, learning tasks, robot navigation
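
The sketch below shows the agent, environment, and feedback loop using tabular Q-learning, which is one specific reinforcement learning algorithm chosen here for illustration. The environment is a made-up corridor of five positions with a reward only at the right-hand end, and the hyperparameter values are arbitrary assumptions.

```python
import random

N_STATES = 5          # positions 0..4; position 4 is the goal
ACTIONS = [-1, +1]    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=200, seed=0):
    """Learn a Q-table by repeatedly attempting to reach the goal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                best = max(q[state])                     # exploit, random tie-break
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-goal position after training.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

The agent starts with no data at all. Everything it knows comes from its own attempts, and the learned policy ends up moving right from every position.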

Learning: Instance-Based Learning vs Model-Based Learning

After machine learning (ML) models are trained on enough data, they are ready to make predictions on unseen data. Some ML models predict the target by comparing the unseen data to the previous data. Other ML models derive a mathematical function that enables them to make general predictions without comparing the new data to the training data. We will call these two broad kinds of ML models instance-based and model-based learning, respectively.

Instance-based learning: The model keeps the training examples and compares new data against them. For example, in a medical diagnosis application, k-NN could be used to predict a patient's condition based on the stored records of similar patients' symptoms and diagnoses.

Model-based learning: The model derives a function from the training data and uses it to predict. For example, in stock market prediction, linear regression could be used to model the relationship between past stock prices and various economic indicators to predict future stock prices.

Steps to build the model

1. Problem Definition

2. Collect Data

3. Clean Your Data

4. Explore Your Data

5. Split Your Data

6. Choose a Model

7. Train Your Model

8. Evaluate Your Model

9. Improve Your Model

10. Deploy Your Model

11. Model Monitoring

12. Model Improvement
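
Steps 5, 7, and 8 can be sketched in a few lines. The example below splits a made-up dataset into training and test sets, "trains" a deliberately trivial model (predict the training mean), and evaluates it with mean absolute error on the held-out rows. The helper name `train_test_split`, the 80/20 split, and the seed are assumptions for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and return (train_rows, test_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up dataset of (feature, target) pairs following y = 2x + 1.
rows = [(x, 2 * x + 1) for x in range(10)]

# Step 5: split the data so evaluation uses rows the model has not seen.
train_rows, test_rows = train_test_split(rows)

# Step 7: "train" a trivial model that always predicts the training mean.
train_mean = sum(y for _, y in train_rows) / len(train_rows)

# Step 8: evaluate on the held-out rows with mean absolute error.
mae = sum(abs(y - train_mean) for _, y in test_rows) / len(test_rows)
print(len(train_rows), len(test_rows), mae)
```

Step 9 would then replace the trivial model with a better one and check whether the error on the test rows goes down.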
