Naive Bayes Classifier: ML Algorithm


While watching a tutorial on Naive Bayes, I got confused. What the heck is this? I understood how the probability works and the mathematical intuition behind the theorem, but not where, when, and why I should use it.
When I focused on the name itself, the answers became clearer, as the breakdown below shows.
Decompose the name
‘Naive’ refers to the assumption that the features are independent of each other given the class, even when in reality they are not. This assumption keeps the model fast, lets it work with small datasets, and in practice often still gives good accuracy. ‘Bayes’ refers to Bayes' theorem. Put together, the name describes a simple, fast method for classification problems.
Now let’s see the proper definition.
Naive Bayes uses probability to determine how likely it is that something belongs to a certain category, given some known data.
To understand how we implement this theorem, let’s first look at the math intuition behind it.
Bayes' theorem gives the probability of event A, given that event B has already occurred:
P(A|B) = P(B|A) × P(A) / P(B)
Where,
P(A|B) => Probability of event A, given B has occurred
P(A) => Probability of event A
P(B) => Probability of event B
P(B|A) => Probability of event B, given A has occurred
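To make the theorem concrete, here is a minimal Python sketch. The counts are hypothetical and chosen only for illustration (they are not from the tennis example that follows), and the function name `bayes` is my own. It checks that P(A|B) computed through Bayes' theorem matches P(A|B) counted directly.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical counts, for illustration only:
# 100 trials, A occurs in 30, B occurs in 20, A and B occur together in 15.
total, count_a, count_b, count_both = 100, 30, 20, 15

p_a = Fraction(count_a, total)               # P(A)   = 3/10
p_b = Fraction(count_b, total)               # P(B)   = 1/5
p_b_given_a = Fraction(count_both, count_a)  # P(B|A) = 1/2

via_bayes = bayes(p_b_given_a, p_a, p_b)
direct = Fraction(count_both, count_b)       # P(A|B) counted directly

print(via_bayes, direct)  # 3/4 3/4
```

Using `Fraction` instead of floats keeps the comparison exact, so the two results can be checked with `==`.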
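We have input features (events that have occurred), i.e., x1, x2, and x3. Using these, we are going to calculate (predict) y. Assuming the features are independent given y, the formula looks like this:
P(y|x1,x2,x3) = P(y) × P(x1|y) × P(x2|y) × P(x3|y) / P(x1,x2,x3)
For new test data, the denominator is the same for every value of y, so we can drop it and compare only the numerators:
P(y|x1,x2,x3) ∝ P(y) × P(x1|y) × P(x2|y) × P(x3|y)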
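Why is it safe to drop the denominator? The sketch below, with made-up priors and likelihoods for two hypothetical classes (`class_a`, `class_b`), suggests the answer: dividing every score by the same constant changes the values but not which class wins. The helper name `score` is an assumption of this example.

```python
from math import prod


def score(prior, likelihoods):
    """Unnormalized Naive Bayes score: P(y) * P(x1|y) * P(x2|y) * ..."""
    return prior * prod(likelihoods)


# Hypothetical numbers for two classes and three features (x1, x2, x3).
scores = {
    "class_a": score(0.6, [0.2, 0.5, 0.1]),
    "class_b": score(0.4, [0.7, 0.3, 0.4]),
}

# The denominator is the same for every class. Under the naive assumption
# it equals the sum of the per-class scores.
evidence = sum(scores.values())
posteriors = {label: s / evidence for label, s in scores.items()}

best_by_score = max(scores, key=scores.get)
best_by_posterior = max(posteriors, key=posteriors.get)

print(best_by_score, best_by_posterior)  # class_b class_b
```

So if only the predicted class is needed, the scores are enough. The division matters only when the actual probabilities are wanted.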
Example:
This dataset shows data collected over 14 days. It records three things each day:
Weather (Sunny, Overcast, Rain)
Temperature (Hot, Mild, Cool)
Whether tennis was played or not (Yes or No)
Problem Statement:
Given that the weather is Sunny and the temperature is Hot, predict whether tennis will be played.
Step 1: Calculate Base Probabilities
First, count how many times tennis was played vs not played:
Tennis played (Yes): 9 days out of 14
Tennis not played (No): 5 days out of 14
Base probabilities:
P(Yes) = 9/14 ≈ 0.643
P(No) = 5/14 ≈ 0.357
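Step 1 can be sketched in Python as below. Only the totals (9 Yes, 5 No) come from the dataset above; the `labels` list is rebuilt from those totals, so its order is an assumption and does not affect the result.

```python
from collections import Counter
from fractions import Fraction

# 9 "Yes" days and 5 "No" days, as counted in Step 1.
# The order of the days is not needed for the base probabilities.
labels = ["Yes"] * 9 + ["No"] * 5

counts = Counter(labels)
priors = {label: Fraction(n, len(labels)) for label, n in counts.items()}

for label, p in priors.items():
    print(f"P({label}) = {p} ≈ {float(p):.3f}")
# P(Yes) = 9/14 ≈ 0.643
# P(No) = 5/14 ≈ 0.357
```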
Step 2: Calculate Conditional Probabilities
Weather Conditions
Count occurrences for each weather type when tennis was/wasn't played:
When Tennis = Yes (9 days):
Sunny: 2 days → P(Sunny|Yes) = 2/9
Overcast: 4 days → P(Overcast|Yes) = 4/9
Rain: 3 days → P(Rain|Yes) = 3/9
When Tennis = No (5 days):
Sunny: 3 days → P(Sunny|No) = 3/5
Overcast: 0 days → P(Overcast|No) = 0/5
Rain: 2 days → P(Rain|No) = 2/5
Temperature Conditions
Count occurrences for each temperature when tennis was/wasn't played:
When Tennis = Yes (9 days):
Hot: 2 days → P(Hot|Yes) = 2/9
Mild: 4 days → P(Mild|Yes) = 4/9
Cool: 3 days → P(Cool|Yes) = 3/9
When Tennis = No (5 days):
Hot: 2 days → P(Hot|No) = 2/5
Mild: 2 days → P(Mild|No) = 2/5
Cool: 1 day → P(Cool|No) = 1/5
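The conditional probabilities above can be derived from the raw counts with a few lines of Python. This is a sketch: the counts are copied from Step 2, while the nested-dictionary layout and the helper name `conditional_table` are my own choices.

```python
from fractions import Fraction

# Counts copied from Step 2, laid out as counts[feature][label][value].
counts = {
    "Weather": {
        "Yes": {"Sunny": 2, "Overcast": 4, "Rain": 3},
        "No": {"Sunny": 3, "Overcast": 0, "Rain": 2},
    },
    "Temperature": {
        "Yes": {"Hot": 2, "Mild": 4, "Cool": 3},
        "No": {"Hot": 2, "Mild": 2, "Cool": 1},
    },
}


def conditional_table(value_counts):
    """Turn the counts for one (feature, label) pair into P(value | label)."""
    total = sum(value_counts.values())
    return {value: Fraction(n, total) for value, n in value_counts.items()}


likelihoods = {
    feature: {label: conditional_table(vc) for label, vc in by_label.items()}
    for feature, by_label in counts.items()
}

# Sanity check: for each feature and label, the probabilities sum to 1.
for by_label in likelihoods.values():
    for table in by_label.values():
        assert sum(table.values()) == 1

print(likelihoods["Weather"]["No"]["Sunny"])      # 3/5
print(likelihoods["Temperature"]["Yes"]["Mild"])  # 4/9
```

Note that `Fraction` reduces automatically, so 3/9 is stored as 1/3.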
Step 3: Apply Naive Bayes Formula
For our specific question (Sunny AND Hot):
Score for Playing Tennis (unnormalized):
P(Yes|Sunny,Hot) ∝ P(Yes) × P(Sunny|Yes) × P(Hot|Yes)
= (9/14) × (2/9) × (2/9)
≈ 0.643 × 0.222 × 0.222
≈ 0.0317
Score for NOT Playing Tennis (unnormalized):
P(No|Sunny,Hot) ∝ P(No) × P(Sunny|No) × P(Hot|No)
= (5/14) × (3/5) × (2/5)
≈ 0.357 × 0.6 × 0.4
≈ 0.0857
Step 4: Normalize to Get Final Percentages
The two values above are not yet probabilities, because the denominator was dropped. To get probabilities that sum to 100%, divide each score by their total:
Total = 0.0317 + 0.0857 = 0.1174
Final probabilities:
P(Yes|Sunny,Hot) = 0.0317 ÷ 0.1174 ≈ 0.27 = 27%
P(No|Sunny,Hot) = 0.0857 ÷ 0.1174 ≈ 0.73 = 73%
Conclusion
When the weather is Sunny and temperature is Hot, there is a:
27% chance tennis will be played
73% chance tennis will NOT be played
This makes sense when you look at the data - both times it was sunny and hot (days 1 and 2), tennis was not played!
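The whole calculation (score each class, then normalize) can be reproduced with a short Python sketch. The probabilities are taken from Steps 1 and 2; the variable names and the use of exact fractions are my own choices, not part of any library.

```python
from fractions import Fraction

# Base probabilities from Step 1.
priors = {"Yes": Fraction(9, 14), "No": Fraction(5, 14)}

# Conditional probabilities from Step 2, only for the values we observe.
likelihoods = {
    "Yes": {"Sunny": Fraction(2, 9), "Hot": Fraction(2, 9)},
    "No": {"Sunny": Fraction(3, 5), "Hot": Fraction(2, 5)},
}
observation = ["Sunny", "Hot"]

# Step 3: unnormalized score = prior * product of the likelihoods.
scores = {}
for label, prior in priors.items():
    s = prior
    for value in observation:
        s *= likelihoods[label][value]
    scores[label] = s

# Step 4: divide by the total so the results sum to 1.
total = sum(scores.values())
posteriors = {label: s / total for label, s in scores.items()}
prediction = max(posteriors, key=posteriors.get)

for label in priors:
    print(f"{label}: score ≈ {float(scores[label]):.4f}, "
          f"probability ≈ {float(posteriors[label]):.2f}")
print("Prediction:", prediction)
# Yes: score ≈ 0.0317, probability ≈ 0.27
# No: score ≈ 0.0857, probability ≈ 0.73
# Prediction: No
```

In exact terms the two probabilities are 10/37 and 27/37, which round to the 27% and 73% shown above.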
Written by Vishal Rewaskar
