The line gave up, so we knocked next door

Saanvi Kumar

So far we’ve been trying to draw lines through chaos. Linear regression tried to predict the future with one. Logistic regression tried to make decisions with one. But eventually, even the most determined line has to give up.

So we stop drawing lines. And start asking around.

Enter k-Nearest Neighbors. It doesn’t care who you are. It decides who you are based on who you hang out with.

Not in a mean way. In a very literal one.

You hand it a new data point and it squints: “Huh, this point looks like it belongs in the 'banana' group. Why? Well, it’s sitting near five other bananas, obviously.”

No equations. No training phase. Just vibes and proximity.

If logistic regression was a courtroom drama with careful deliberation and a judge drawing boundaries - KNN is high school gossip.

And surprisingly? That gossip is often right.

So, how does the gossip work?

At the heart of KNN is a very simple question: “Who are your neighbors?”

See, KNN doesn’t care about understanding the why of your behavior. It skips the backstory. All it wants to know is who’s around you. Because if you’re surrounded by snakes, chances are - you’re one too.

It remembers everything (like the one friend who has screenshots)

KNN doesn’t have a training phase. It just stores all the data you give it - every past example, every feature value, every label. It’s basically hoarding information. When a new point comes in, it pulls up the entire group chat to find the closest match.

Who’s the closest?

Now, KNN’s whole job is to figure out the closest match. But “closeness” isn’t always as obvious as it sounds.

When we say “distance,” we usually mean Euclidean - the ideal, straight-line “as the crow flies” path. Like drawing a line on Google Maps from your house to the ice cream shop and pretending you can walk through buildings.

But sometimes, we have to be realistic. You can’t walk through walls. You take turns, follow roads, avoid potholes. That’s Manhattan distance. It adds up the actual distance you walk, not the shortcut you wish you could take.

Different distance formulas work better for different kinds of data - especially when movement is restricted or your features behave differently.
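
Want to see the difference in numbers? Here’s a tiny sketch with made-up coordinates - the same trip, measured both ways:

import numpy as np

home = np.array([0, 0])        # your house
ice_cream = np.array([3, 4])   # the shop: 3 blocks east, 4 blocks north

# Euclidean: the straight line through the buildings
euclidean = np.sqrt(np.sum((home - ice_cream) ** 2))
print(euclidean)   # 5.0

# Manhattan: the blocks you actually have to walk
manhattan = np.sum(np.abs(home - ice_cream))
print(manhattan)   # 7

In scikit-learn, you can switch between them with the metric parameter of KNeighborsClassifier - for example metric='manhattan' instead of the default Euclidean-style distance.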

It asks multiple people

You choose a number called 'K' - the number of neighbors KNN should consult. If K = 1, it only listens to the nearest point. That’s risky. It’s like taking career advice from one person (who might not even be in the same field as you).

Too low a K? Overreactive. Too high a K? Indecisive.

The sweet spot? Depends on the data. But odd numbers are preferred (in case there's a tie), and it’s usually a matter of tuning. Don’t be afraid to experiment.

Each of the K neighbors gets a vote. If the majority of them are from class A, your new point is class A. That’s it.
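
The whole procedure fits in a few lines. Here’s a rough from-scratch sketch (the fruit data is invented purely for illustration): measure the distances, grab the K closest, count the votes.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, new_point, k=3):
    # 1. Distance from the new point to every stored point (Euclidean)
    distances = np.sqrt(((X_train - new_point) ** 2).sum(axis=1))
    # 2. Indices of the k nearest neighbors
    nearest = np.argsort(distances)[:k]
    # 3. Majority vote among their labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Invented data: bananas cluster bottom-left, oranges top-right
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array(['banana', 'banana', 'banana', 'orange', 'orange', 'orange'])

print(knn_predict(X_train, y_train, np.array([2, 2]), k=3))  # banana
print(knn_predict(X_train, y_train, np.array([7, 8]), k=3))  # orange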

It can get confused if your features are loud

Imagine you’re trying to classify if someone is normal. You pick two features:

  • Number of alarms they’ve set to wake up in the morning

  • Number of unread messages they have on their phone

Now, let’s talk patterns.

Someone who sets just one alarm? Deeply suspicious.
Someone who has twenty alarms, spaced 3 minutes apart and all labeled “THIS IS SERIOUS”, “GET UP NOW”, “LEAVE NOW!!” and “HAVE YOU LEFT?!?!”? That’s normal.

Unread messages? If it’s zero, crazy behaviour. But if they have 183 unread WhatsApp messages and still manage to reply “lmao” within 0.2 seconds - that’s a stable kind of chaos.

Here’s the twist: if you don’t scale these features, the number with the bigger range (say, unread messages going up to 1000) will totally overpower the other. Suddenly, alarms don’t matter - even though they’re really good stability indicators.

That’s why we scale features - so they’re all on the same page. Without it, KNN listens to the loudest one - not necessarily the most important.
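
You can watch this happen with the same made-up [alarms, unread messages] data we’ll use in the code section below:

import numpy as np
from sklearn.preprocessing import StandardScaler

# [num_of_alarms, num_of_unread_msgs] - invented people
X = np.array([[1, 0], [10, 350], [8, 220], [2, 5], [12, 480], [1, 1], [7, 300]])
query = np.array([2, 400])   # 2 alarms, 400 unread messages

# Unscaled: every distance is basically just "how different is the message count?"
print(np.round(np.linalg.norm(X - query, axis=1)))   # roughly [400, 51, 180, 395, 81, 399, 100]

# Scaled: alarms finally get a say in who counts as "near"
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
q_scaled = scaler.transform([[2, 400]])[0]
print(np.round(np.linalg.norm(X_scaled - q_scaled, axis=1), 2))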

So before you go gossiping with KNN, do your homework:

  • Scale your features (standardize or normalize)

  • Pick a distance metric that makes sense

  • And don’t trust one neighbor’s opinion - ask a few

That’s how you make KNN less like a neighbourhood aunty and more like a spy.

Let’s code it

Enough talk. Time to eavesdrop.

We’re going to classify people based on chaotic energy - using two features:

  • Number of alarms they set in the morning

  • Number of unread messages they have on their phone

We’ll use KNN to figure out: is this person "normal" or "psychotic" (a.k.a. a menace to society)?

# Step 1: Import the model
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

# Step 2: Gather data, format [num_of_alarms, num_of_unread_msgs]
X = np.array([
    [1, 0],     # psychotic
    [10, 350],  # normal
    [8, 220],   # normal
    [2, 5],     # psychotic
    [12, 480],  # normal
    [1, 1],     # psychotic
    [7, 300]    # normal
])

y = np.array(['psychotic', 'normal', 'normal', 'psychotic', 'normal', 'psychotic', 'normal'])

# Step 3: Feature Scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 4: Initialize the model
model = KNeighborsClassifier(n_neighbors=3)  # gossiping with 3 neighbors

# Step 5: Fit the model
model.fit(X_scaled, y)

# Step 6: Make a prediction
# New person: 2 alarms, 400 unread messages
new_data = scaler.transform([[2, 400]])
print(model.predict(new_data))  # Outputs: ['normal'] - the 3 nearest scaled neighbors are all in the 'normal' camp

Here, we’ve used StandardScaler. It transforms your features so they each have:

  • Mean = 0

  • Standard deviation = 1

This basically makes every feature speak in the same voice. No feature shouts. No feature whispers.

Here's what .fit_transform() does:

  • fit() calculates the mean and std deviation for each feature

  • transform() applies the standardization formula to squash everything into similar scales

And don’t forget - any new data must be scaled the same way using .transform(). Not fit_transform() again. Just .transform(). You don’t want to change the rules halfway through gossiping.
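
If you’re curious what’s actually under the hood, it’s just this (toy numbers, purely to peek at the mechanics):

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 0.0], [10.0, 350.0], [8.0, 220.0], [2.0, 5.0]])

scaler = StandardScaler()
scaler.fit(X_train)           # learns the per-column mean_ and scale_ (std)

# transform() is literally (x - training_mean) / training_std
manual = (X_train - scaler.mean_) / scaler.scale_
print(np.allclose(manual, scaler.transform(X_train)))   # True

# New data gets judged by the *training* mean and std - no re-fitting
new_point = np.array([[2.0, 400.0]])
print(scaler.transform(new_point))
print((new_point - scaler.mean_) / scaler.scale_)        # same numbers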

Other scalers that we can use are:

  • MinMaxScaler: squishes all values between 0 and 1. Great when you know your data has clear limits (like percentages or pixel values).

  • RobustScaler: ignores outliers (uses median and IQR instead of mean/std). Handy when your data has some rebellious spikes.

  • Normalizer: scales rows instead of columns. Less common in classification, more in things like text vectors.
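
Here’s a quick look at what each of them does to the same toy matrix (invented numbers again):

import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler, Normalizer

X = np.array([[1.0, 0.0], [10.0, 350.0], [8.0, 220.0], [12.0, 480.0]])

print(MinMaxScaler().fit_transform(X))   # every column squashed into [0, 1]
print(RobustScaler().fit_transform(X))   # centered on the median, scaled by the IQR
print(Normalizer().fit_transform(X))     # each *row* rescaled to unit length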

When does the gossip go wrong?

KNN is great at reading the room, but it has its moments of unreliability.

Curse of Dimensionality

As the number of features (dimensions) increases, the distance between data points becomes less meaningful.

Pile on the dimensions and every point drifts far from every other point, so everything starts to look roughly equally distant - like trying to figure out who’s closest to you at a packed music festival from a drone shot.

This dilutes KNN’s ability to find truly “near” neighbors. Suddenly, your neighbors aren’t that relevant anymore - they just happened to be floating in the same space.
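
You don’t have to take this on faith. Here’s a rough simulation with random points (the data is made up, but the trend it shows is the real phenomenon): as dimensions pile up, the nearest neighbor stops being meaningfully nearer than the farthest one.

import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 100, 1000]:
    points = rng.random((1000, d))    # 1000 random points in d dimensions
    query = rng.random(d)
    dists = np.linalg.norm(points - query, axis=1)
    print(f"d={d:4d}  nearest={dists.min():.2f}  farthest={dists.max():.2f}  "
          f"ratio={dists.max() / dists.min():.1f}")
# As d grows, the ratio drifts toward 1: "nearest" barely means anything anymore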

Noisy Neighbors

KNN assumes your neighbors are reliable, but what if your training data is full of mislabeled points or outliers? Then you’ve just asked the wrong people for advice.

One bad label can throw off the vote - especially if K is small. And since KNN doesn’t learn patterns (it memorizes), it can’t filter out the nonsense.
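
Here’s a tiny made-up illustration of how one bad screenshot in the group chat sways the vote when K is small:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two tidy clusters, except one point in the 'A' cluster got labeled 'B' by mistake
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2], [1.5, 1.5],
              [8, 8], [8, 9], [9, 8], [9, 9]])
y = np.array(['A', 'A', 'A', 'A', 'B',   # <- the fifth point is the bad label
              'B', 'B', 'B', 'B'])

query = [[1.6, 1.6]]   # deep inside the A cluster, right next to the mislabeled point

print(KNeighborsClassifier(n_neighbors=1).fit(X, y).predict(query))  # ['B'] - fooled
print(KNeighborsClassifier(n_neighbors=5).fit(X, y).predict(query))  # ['A'] - outvoted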

Imbalanced Classes

If one class has way more data points than the other, it will dominate the neighborhood. Imagine trying to classify bananas, but 95% of your dataset is oranges - KNN will almost always guess “oranges” just based on sheer numbers.
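
A deliberately lopsided, made-up example - plus one partial band-aid, scikit-learn’s weights='distance' option, which lets closer neighbors shout louder:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# 100 oranges on a neat grid, 2 lonely bananas off to the side - all invented
oranges = np.array([[x, y] for x in range(10) for y in range(10)], dtype=float)
bananas = np.array([[11.0, 11.0], [11.5, 11.5]])
X = np.vstack([oranges, bananas])
y = np.array(['orange'] * len(oranges) + ['banana'] * len(bananas))

query = [[11.1, 11.1]]   # sitting right on top of the banana cluster

# Plain majority vote: only 2 of the 5 nearest neighbors *can* be bananas
print(KNeighborsClassifier(n_neighbors=5).fit(X, y).predict(query))   # ['orange']

# Distance-weighted vote: the two very close bananas outweigh the farther oranges
print(KNeighborsClassifier(n_neighbors=5, weights='distance').fit(X, y).predict(query))   # ['banana']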

Costly Predictions

Since KNN doesn’t actually “learn,” it stores all training data and does the work at prediction time. That means every new prediction involves computing distances from every stored point.

Great for small datasets. Terrible for large ones.
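
One rough way to feel this (random made-up data; exact timings will vary by machine) is to force the brute-force search and watch where the time goes:

import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((100_000, 2))            # 100k stored gossip subjects
y = rng.integers(0, 2, size=100_000)

model = KNeighborsClassifier(n_neighbors=5, algorithm='brute')

t0 = time.perf_counter()
model.fit(X, y)                         # "training" = basically just storing the data
t1 = time.perf_counter()
model.predict(rng.random((2_000, 2)))   # the real work: distances to all 100k points, for every query
t2 = time.perf_counter()

print(f"fit:     {t1 - t0:.4f} s")
print(f"predict: {t2 - t1:.4f} s")

(scikit-learn’s default algorithm='auto' will often build a KD-tree or ball tree instead, which speeds up lookups a lot on low-dimensional data - but the pay-at-prediction-time nature of KNN doesn’t go away.)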


KNN pays attention. It remembers everything you’ve told it, and when the time comes, it makes a decision based on who you’re most similar to.

It’s not perfect. It gets overwhelmed in big crowds, trusts the wrong voices sometimes, and panics when everyone starts shouting different things. But hey, don’t we all?

Still - with a little cleanup, a bit of scaling, and a good choice of K - it’s surprisingly sharp.

It doesn’t need to explain the world. It just needs to know where you stand in it.

And sometimes, that’s enough.

Next up: we're done listening to the crowd - it’s time to start asking questions. Loudly. One at a time. In a very specific order.
