AdaBoost


AdaBoost (Adaptive Boosting) is an ensemble learning technique primarily used for classification problems. It combines multiple weak learners (typically decision stumps, i.e., decision trees with a single split) into a strong classifier. The models are trained sequentially, with each subsequent model focusing more on the errors made by the previous one.

In AdaBoost, each data point is initially assigned an equal weight of 1/N, where N is the total number of data points. These weights help determine the importance of each sample in training. After each iteration, the misclassified points receive higher weights, making them more influential in the training of the next weak learner. This process continues until a stopping criterion is met, such as a predefined number of weak learners or minimal classification error. Finally, the weak learners are combined through a weighted majority vote to make predictions.
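
The loop below is a minimal from-scratch sketch of this procedure, not the scikit-learn implementation; the function names adaboost_fit and adaboost_predict are purely illustrative, and it assumes binary labels encoded as -1/+1 with scikit-learn decision stumps as the weak learners.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Minimal AdaBoost sketch; y must hold labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # every point starts with weight 1/N
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)         # weak learner trained on current weights
        pred = stump.predict(X)
        err = w[pred != y].sum()                 # weighted misclassification rate
        err = np.clip(err, 1e-10, 1 - 1e-10)     # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)    # how much this learner's vote counts
        w *= np.exp(-alpha * y * pred)           # misclassified points get heavier
        w /= w.sum()                             # renormalize so weights sum to 1
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # Weighted majority vote: the sign of the weighted sum of stump outputs
    return np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))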

Geometric Intuition :-

AdaBoost refines classification boundaries by iteratively focusing on misclassified points. Initially, all data points have equal weight, and the first weak classifier (e.g., a decision stump) creates a rough decision boundary. Misclassified points are then assigned higher weights, effectively pulling the decision boundary toward them in the next iteration. Each new weak classifier adjusts accordingly, progressively improving the separation between classes.

The final model is a weighted combination of all weak classifiers, forming a strong classifier with a well-refined decision boundary. Geometrically, this process "warps" the data space to enhance classification accuracy.

AdaBoost represents class labels as +1 and -1 instead of 0 and 1 because this simplifies the math in two ways:

  1. Easier Weight Updates

    • The model updates each weight by multiplying it by e^(−α·y·f(x)), where y (the true label) and f(x) (the weak learner's prediction) are each +1 or −1.

    • If correctly classified, the weight decreases; otherwise, it increases.

  2. Simplifies Final Prediction

    • The strong classifier is a weighted sum of the weak learners: F(x) = ∑ α_t·f_t(x).

The final prediction is simply sign(F(x)), making decisions straightforward.
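
A tiny numeric illustration of both points (all numbers below are made up purely for illustration):

import numpy as np

alpha = 0.5                               # illustrative learner weight
y_true = np.array([+1, -1, +1])           # true labels
y_pred = np.array([+1, +1, +1])           # stump predictions (second point is wrong)

w = np.full(3, 1 / 3)                     # equal starting weights of 1/N
w = w * np.exp(-alpha * y_true * y_pred)  # shrink correct points, grow the wrong one
w = w / w.sum()                           # renormalize
print(w)                                  # the misclassified point now has the largest weight

# Final prediction: the sign of the weighted sum of weak-learner outputs
F = 0.9 * (+1) + 0.4 * (-1) + 0.2 * (+1)  # illustrative alphas and predictions
print(np.sign(F))                         # 1.0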

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X, y = data.data, data.target

# Convert to binary classification (only two classes)
y = (y != 0).astype(int)  

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create AdaBoost model with a weak learner (decision stump)
base_model = DecisionTreeClassifier(max_depth=1)
# "estimator" replaced "base_estimator" in scikit-learn 1.2
adaboost = AdaBoostClassifier(estimator=base_model, n_estimators=50, learning_rate=1.0, random_state=42)

# Train the model
adaboost.fit(X_train, y_train)

# Make predictions
y_pred = adaboost.predict(X_test)

# Print accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))

AdaBoost Hyperparameters :-

1. n_estimators: It decides how many weak models (like small decision trees) are used. More models can improve accuracy but may also cause overfitting.

2. learning_rate: It controls how much each weak model contributes to the final result. A smaller value makes learning slower but can improve generalization.

3. estimator (called base_estimator in older scikit-learn versions): It is the type of weak model used, usually a decision stump (a decision tree with one split). You can change it to other models, such as logistic regression.

4. algorithm: It determines how the boosting updates are computed: "SAMME" uses discrete class predictions, while "SAMME.R" uses class probabilities and typically converges faster. Recent scikit-learn releases deprecate "SAMME.R" and use "SAMME" by default.

5. random_state: It sets a fixed seed to ensure the results remain the same every time the model runs.
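
As a rough sketch of how these hyperparameters are tuned in practice (the candidate values below are arbitrary, not recommendations), a small grid search over n_estimators and learning_rate could look like this, reusing X_train and y_train from the iris example above; note again that scikit-learn 1.2+ passes the weak learner as estimator rather than base_estimator:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

# Candidate values are illustrative, not recommendations
param_grid = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.1, 0.5, 1.0],
}

ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # weak learner: a decision stump
    random_state=42,                                # fixed seed for reproducible results
)

search = GridSearchCV(ada, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)        # X_train, y_train from the iris example above
print(search.best_params_, search.best_score_)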

Advantages and Disadvantages of using AdaBoost:-

Advantages of AdaBoost

  1. Improves Weak Learners – Converts weak models (e.g., decision stumps) into a strong classifier.

  2. Feature Importance – Helps identify important features in the dataset (see the short snippet after this list).

  3. No Need for Feature Scaling – Works well without normalizing data.

  4. Simple and Interpretable – Easy to understand and implement.

  5. Less Prone to Overfitting – Tends to resist overfitting when the number of estimators is tuned properly.
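
For point 2, a fitted scikit-learn AdaBoostClassifier exposes per-feature scores through its feature_importances_ attribute; a short snippet continuing from the iris example above:

# Reuses the fitted "adaboost" model and the "data" bunch from the iris example
for name, score in zip(data.feature_names, adaboost.feature_importances_):
    print(f"{name}: {score:.3f}")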

Disadvantages of AdaBoost

  1. Sensitive to Noisy Data – Misclassified points get higher weights, which can amplify noise.

  2. Needs Careful Parameter Tuning – Too many weak learners can lead to overfitting.

  3. Weak Against Overlapping Classes – Performs poorly when classes are not well-separated.

  4. Computationally Expensive – Training many weak models can be slow.
