Don't Be Confused by the Confusion Matrix - Simple Words, Strong Understanding

Pradeep Malage
3 min read

🤔 What is a Confusion Matrix?

If you’ve just started learning machine learning, keep hearing terms like True Positive and False Negative, and are wondering what a confusion matrix even means, this blog is for you.

Let’s break it down in the simplest, most practical way possible.

🧠 Think of a Confusion Matrix Like a Truth Table for Predictions

When you train a classification model, it tries to predict classes — like:

  • Is an email spam or not?

  • Does a patient have diabetes or not?

To check how well the model is doing, we compare:

  • Actual value (what really happened)

  • Predicted value (what the model guessed)

The confusion matrix gives a summary table of how often the model was right or wrong, and in which way.

📊 Confusion Matrix Layout (Binary Classification)

Let’s start with a basic 2-class (Yes/No) example.

|                 | Predicted: Yes         | Predicted: No           |
| --------------- | ---------------------- | ----------------------- |
| **Actual: Yes** | ✅ True Positive (TP)  | ❌ False Negative (FN)  |
| **Actual: No**  | ❌ False Positive (FP) | ✅ True Negative (TN)   |
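To see this in code, here is a minimal sketch using scikit-learn’s `confusion_matrix` on made-up labels (the `y_true` and `y_pred` lists are invented purely for illustration). Note that scikit-learn orders the matrix by ascending label, so it comes back as `[[TN, FP], [FN, TP]]` rather than the layout in the table above.

```python
from sklearn.metrics import confusion_matrix

# Made-up ground truth and predictions (1 = Yes, 0 = No), for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# scikit-learn orders rows/columns by ascending label (0, then 1),
# so the 2x2 matrix unravels as TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=4, TN=4, FP=1, FN=1
```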

🎯 Real-World Example: COVID Test

Imagine you take a COVID test. Think of the test as a machine learning model making a prediction about you.

|                             | Test Says Positive   | Test Says Negative |
| --------------------------- | -------------------- | ------------------ |
| **You Actually Have COVID** | TP (Caught it right) | FN (Missed it!)    |
| **You Don’t Have COVID**    | FP (False alarm!)    | TN (All good)      |

💡 How to Remember These Terms?

| Term | Meaning                          | Memory Trick           |
| ---- | -------------------------------- | ---------------------- |
| TP   | Model was right about a “YES”    | ✅ Correct YES         |
| TN   | Model was right about a “NO”     | ✅ Correct NO          |
| FP   | Model said YES but it was NO     | ❌ “False Alarm”       |
| FN   | Model said NO but it was YES     | ❌ “Oops, Missed It!”  |

📈 Metrics You Can Calculate

From the confusion matrix, we can calculate several important model performance metrics:

1. Accuracy

How often is the model correct?

$$\text{Accuracy} = \frac{TP + TN}{\text{Total}}$$

Example: Out of 100 predictions, if 90 are right, accuracy = 90%
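Here is the same calculation as a quick Python sketch, using hypothetical counts (TP=40, TN=50, FP=4, FN=6, chosen so they add up to the 90-out-of-100 example above):

```python
# Hypothetical counts, chosen to match the 90-out-of-100 example above
TP, TN, FP, FN = 40, 50, 4, 6

# Accuracy = correct predictions / all predictions
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 0.9 -> 90%
```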


2. Precision

When the model says “Yes,” how often is it correct?

$$\text{Precision} = \frac{TP}{TP + FP}$$

Used when false positives are costly (e.g., false cancer alarm).
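With the same hypothetical counts as above (TP=40, FP=4), precision works out like this:

```python
# Hypothetical counts from the example above: 40 correct "Yes" calls, 4 false alarms
TP, FP = 40, 4

precision = TP / (TP + FP)
print(round(precision, 3))  # 0.909 -> about 91% of the "Yes" predictions were correct
```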


3. Recall (Sensitivity)

How many actual "Yes" cases did the model catch?

$$\text{Recall} = \frac{TP}{TP + FN}$$

Used when missing positives is dangerous (e.g., missing COVID cases).
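Again with the same hypothetical counts (TP=40, FN=6), recall looks like this:

```python
# Hypothetical counts from the example above: 40 positives caught, 6 missed
TP, FN = 40, 6

recall = TP / (TP + FN)
print(round(recall, 3))  # 0.87 -> the model caught about 87% of the actual "Yes" cases
```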


4. F1 Score

A balance between Precision and Recall.

$$\text{F1 Score} = \frac{2 \cdot (\text{Precision} \cdot \text{Recall})}{\text{Precision} + \text{Recall}}$$

Ideal when you want a balanced view.
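Putting it together with the hypothetical precision and recall computed above:

```python
# Precision and recall from the hypothetical counts above (TP=40, FP=4, FN=6)
precision, recall = 40 / 44, 40 / 46

f1 = 2 * (precision * recall) / (precision + recall)
print(round(f1, 3))  # 0.889 -> a single score that balances precision and recall
```

If you’d rather not compute these by hand, scikit-learn’s `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` take `y_true` and `y_pred` directly.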

🧠 Final Thought

The confusion matrix is one of the most important tools for understanding whether your model is making the right calls. Instead of relying on accuracy alone, use precision, recall, and F1 score to get deeper insight.

🧪 Refer to the GitHub repo here: https://github.com/pradipmalge/ml-diaries/blob/main/challenge/day00/ConfusionMatrix/confusion_matrix.ipynb
