Naive Bayes Classifier
I'm sure we've all had our fair share of encounters with probability. Some people have it worse with conditional probability, and Bayes' theorem is where it starts to feel advanced. But to be honest, what is so advanced about "the probability of A given B equals the probability of B given A, multiplied by the probability of A, all divided by the probability of B"? Read this out loud for context:
P(A|B) = [P(B|A) * P(A)] / P(B)
This is Bayes' theorem. In machine learning, it is used to predict target values from the conditional probabilities of the features.
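To make the formula concrete, here is a toy calculation with made-up numbers (the probabilities below are purely illustrative, not from any real dataset):

```python
# Made-up numbers for illustration: a test for some condition.
p_a = 0.01         # P(A): prior probability of the condition
p_b_given_a = 0.9  # P(B|A): probability of a positive test given the condition
p_b = 0.05         # P(B): overall probability of a positive test

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # ≈ 0.18
```

Notice how the posterior (18%) is much higher than the prior (1%) but still far from certain, because positive tests are relatively common overall.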
The "naive" part stems from the assumption that all the features are independent of each other. This lets the model multiply the individual conditional probabilities of the features instead of modelling how they depend on one another. Even though this assumption is rarely strictly true, the naivety keeps the model simple.
There are three common types of naive Bayes model in ML: Bernoulli, Multinomial and Gaussian. Bernoulli is used when all the features are binary: 0 and 1, yes and no features only. Multinomial is used when the features are discrete counts or categorical, with different figures representing different categories. And Gaussian is used when the features contain continuous values.
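A quick sketch of the match between feature type and model, using tiny made-up datasets (the numbers below are invented for illustration only):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB

# Tiny invented datasets showing which model suits which feature type.
X_binary = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])  # yes/no features -> Bernoulli
X_counts = np.array([[3, 0], [0, 2], [5, 1], [0, 4]])  # discrete counts -> Multinomial
X_cont = np.array([[1.2, 0.3], [0.1, 2.4],
                   [1.5, 0.2], [0.3, 2.1]])            # continuous values -> Gaussian
y = np.array([0, 1, 0, 1])

for model, X in [(BernoulliNB(), X_binary),
                 (MultinomialNB(), X_counts),
                 (GaussianNB(), X_cont)]:
    model.fit(X, y)
    print(type(model).__name__, model.predict(X))
```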
My mini project uses the properties of wine to classify samples into three classes: class_1, class_2 and class_3. I tested both the Multinomial and Gaussian naive Bayes models to see which one worked best. I also used both the train-test split method and the K-Fold cross-validation technique to get more solid results.
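Here is a minimal sketch of that comparison using scikit-learn's built-in wine dataset; my actual notebook may use different split sizes, fold counts and random seeds:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)

# Method 1: train-test split (seed chosen arbitrarily for this sketch)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
gnb_acc = GaussianNB().fit(X_train, y_train).score(X_test, y_test)
mnb_acc = MultinomialNB().fit(X_train, y_train).score(X_test, y_test)

# Method 2: 5-fold cross-validation for a more solid estimate
gnb_cv = cross_val_score(GaussianNB(), X, y, cv=5).mean()
mnb_cv = cross_val_score(MultinomialNB(), X, y, cv=5).mean()

print(f"Gaussian:    split={gnb_acc:.3f}, 5-fold={gnb_cv:.3f}")
print(f"Multinomial: split={mnb_acc:.3f}, 5-fold={mnb_cv:.3f}")
```

Running both evaluation methods guards against getting lucky (or unlucky) with a single split.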
Part 1
Part 2
Part 3
Part 4
Part 5
Part 6 (The End)
As you can see in the images of the workbook, Gaussian is by far the most accurate. This is essentially because features like alcohol are continuous values (check fig 2 above).
So, to use naive Bayes:
# Import the three naive Bayes variants from scikit-learn
from sklearn.naive_bayes import MultinomialNB
from sklearn.naive_bayes import GaussianNB
from sklearn.naive_bayes import BernoulliNB

MNB = MultinomialNB()  # for discrete/count features
GNB = GaussianNB()     # for continuous features
BNB = BernoulliNB()    # for binary features
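Once instantiated, all three models share the same fit/predict API. A minimal sketch fitting GaussianNB on scikit-learn's built-in wine data (assuming the same dataset as the project):

```python
from sklearn.datasets import load_wine
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)

GNB = GaussianNB()
GNB.fit(X, y)  # learns class priors and a per-feature Gaussian per class

preds = GNB.predict(X[:5])        # predicted class labels for five samples
probs = GNB.predict_proba(X[:5])  # probability of each class per sample
print(preds)
print(probs.shape)  # one row per sample, one column per wine class
```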
So, yes, bye. Ciao!