Bias in k-NN classifier
In the case of K-nearest neighbors (KNN) classifiers, the bias typically increases as the value of K increases. Here's why:
When K is small (e.g., K=1), the model tends to overfit the training data, as it is highly sensitive to noise and anomalies. This results in a low-bias, high-variance model.
As K increases, the model becomes smoother, relying on more neighbors to make decisions. This reduces the model’s sensitivity to individual data points, making it less prone to overfitting. However, this generalization can also cause the model to make overly simplified predictions, resulting in higher bias but lower variance.
Thus, the bias of a KNN classifier increases as K increases.
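To see this trend concretely, here is a minimal sketch (not from the article, assuming scikit-learn and an illustrative synthetic dataset from `make_moons`) that compares training and test accuracy for a few values of K:

```python
# Illustrative sketch: how training/test accuracy typically move as K grows.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Noisy two-class data so the effect of K is visible.
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 101):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    # Small K: near-perfect training accuracy (low bias, high variance).
    # Large K: both accuracies drift downward as the model oversimplifies (higher bias).
    print(f"K={k:>3}  train={train_acc:.2f}  test={test_acc:.2f}")
```

With K=1 the training accuracy is essentially perfect while test accuracy lags (overfitting); as K grows, the gap shrinks and, past some point, both scores fall because the model is now too smooth.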
"Smoother" in KNN:
In the context of KNN, "smoothness" doesn't refer to a fitted curve or line, since KNN is not a parametric model like linear regression. Instead, it refers to how decisions are made based on the surrounding data points (neighbors), and to the shape of the resulting decision boundary.
When K=1: Each point in the dataset is classified based on its closest neighbor. This makes the decision boundary highly sensitive to individual data points and noise, resulting in jagged, highly irregular decision boundaries.
When K increases: The model uses more neighbors to decide the class of a point. With a larger K, the classification of any given point depends on a larger group of neighbors. This smooths the decision-making process, because individual outliers or noisy points don't affect the decision as much. As a result, the model’s decision boundaries become less complex and more generalized.
A "smoother" model in KNN means that, as K increases, the classifier generalizes better to larger regions rather than making local, fine-grained distinctions.
Higher Bias and Its Impact:
Bias refers to the error due to oversimplified assumptions in the model. In KNN, when we increase K, the model makes simpler decisions by considering more neighbors, which often leads to higher bias.
Higher bias means the model starts to make more generalized predictions. As K increases, it classifies a point according to the majority of a larger neighborhood, even when subtle but important distinctions exist in smaller regions of the data. This causes the model to miss finer patterns in the data.
Higher bias = more error from underfitting. The model may ignore complex relationships in the data because it is averaging over too many points, so its predictions no longer follow the actual structure of the data closely.
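To make the underfitting limit concrete, here is a small sketch (scikit-learn, with an illustrative synthetic dataset) of the extreme case K = n: the classifier ignores local structure entirely and predicts the overall majority class for every point.

```python
# Illustrative sketch: with K equal to the whole training set, every prediction
# is just the global majority class - pure underfitting / maximal bias.
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=5,
                           weights=[0.6, 0.4], random_state=0)

model = KNeighborsClassifier(n_neighbors=len(X)).fit(X, y)
preds = model.predict(X)
print(Counter(y))      # the class balance of the data
print(Counter(preds))  # all predictions collapse to the majority class
```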