[Fixed] ‘super’ object has no attribute ‘__sklearn_tags__’

The moment you start using Scikit-learn, you’re bound to run into cryptic errors that can confuse you. For example, when performing hyperparameter tuning with XGBoost using Scikit-learn’s RandomizedSearchCV, you might encounter this error:
```
AttributeError: 'super' object has no attribute '__sklearn_tags__'
```
This blog dives deep into what this error means, why it occurs, and how to resolve it step by step. We’ll use an XGBRegressor tuned with RandomizedSearchCV as the running example (the same applies to custom estimators) to make the explanation relatable and practical.
Scikit-learn, a popular machine learning library in Python, uses a tagging system (the `__sklearn_tags__` method) to assign properties to its estimators. These tags describe the capabilities and requirements of an estimator. For instance:
- Pipeline integration: the tags determine how data is passed between different pipeline components.
- Validation: the tags help validate input data before processing to avoid runtime errors.
- Supervision: the tags indicate whether the model is supervised or unsupervised.
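To make this concrete, here is a quick way to peek at an estimator’s tags. This is only a minimal sketch assuming scikit-learn 1.6 or newer, where the tags are exposed through the `__sklearn_tags__()` method named in the error; the exact fields of the returned object can differ slightly between releases.

```python
# Minimal sketch: inspecting an estimator's tags (assumes scikit-learn >= 1.6,
# where tags are exposed via the __sklearn_tags__() method named in the error).
from sklearn.tree import DecisionTreeRegressor

est = DecisionTreeRegressor()
tags = est.__sklearn_tags__()

print(tags.estimator_type)        # 'regressor' -> tells scikit-learn what kind of estimator this is
print(tags.target_tags.required)  # True -> fit() requires a target y (supervised)
```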
The error “‘super’ object has no attribute ‘__sklearn_tags__’” typically occurs when Scikit-learn attempts to call this method on an estimator and the estimator is either:

- Misconfigured: the `__sklearn_tags__` method is overridden incorrectly in a custom estimator (a minimal sketch of this failure mode follows this list).
- Incompatible: the estimator is not aligned with the version of Scikit-learn being used.
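To see how the misconfigured case produces exactly this message, here is a hypothetical custom estimator (the class name is made up for illustration), assuming scikit-learn 1.6+. RegressorMixin’s `__sklearn_tags__` delegates to `super().__sklearn_tags__()`; if no parent class in the MRO provides that method, for example because the class forgets to also inherit from BaseEstimator, Python raises the very AttributeError we are discussing.

```python
# Hypothetical misconfigured custom estimator (assumes scikit-learn >= 1.6).
# BrokenRegressor inherits from RegressorMixin but NOT from BaseEstimator, so
# RegressorMixin.__sklearn_tags__() calls super().__sklearn_tags__() and finds
# no implementation further up the MRO.
from sklearn.base import RegressorMixin


class BrokenRegressor(RegressorMixin):
    def fit(self, X, y):
        return self

    def predict(self, X):
        return X.sum(axis=1)


# Raises: AttributeError: 'super' object has no attribute '__sklearn_tags__'
BrokenRegressor().__sklearn_tags__()
```

Inheriting from BaseEstimator as well (for example `class FixedRegressor(RegressorMixin, BaseEstimator)`) gives `super()` something to delegate to and makes the error disappear. XGBoost’s situation is the incompatible case instead: older releases of its scikit-learn wrapper simply predate the new interface.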
Understanding the Context
When using XGBoost with Scikit-learn’s RandomizedSearchCV for hyperparameter tuning, we rely on Scikit-learn’s tagging system to:

- Validate the compatibility between XGBoost and Scikit-learn
- Ensure proper data handling in the cross-validation process
- Manage the parameter search efficiently

Reproducing the Error
Here’s a typical scenario where this error occurs when trying to tune an XGBRegressor.
Recently, I started working on a Weather Prediction System, a project requiring machine learning models to forecast temperature and precipitation. For this project, I chose to use XGBoost, a powerful gradient boosting algorithm, combined with scikit-learn for hyperparameter tuning using RandomizedSearchCV.
I used VS Code as my development environment. Here’s how I set up and ran the code.
Navigate to your desired location and create a folder:

```bash
mkdir weather_prediction
cd weather_prediction
```

Set up a virtual environment:

```bash
python -m venv venv
source venv/bin/activate
```

On Windows:

```bash
venv\Scripts\activate
```

Install the necessary Python packages:

```bash
pip install scikit-learn xgboost numpy
```

Launch VS Code in the project folder:

```bash
code .
```

Create a file named train_model.py.

Add and Run the Code
Paste the following code into train_model.py:
```python
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor
from sklearn.datasets import make_regression
import numpy as np

# Generate sample regression data
X, y = make_regression(n_samples=100, n_features=10, random_state=42)

# Initialize XGBoost regressor
model = XGBRegressor()

# Define parameter search space
param_dist = {
    'max_depth': [3, 4, 5],
    'learning_rate': [0.01, 0.1],
    'n_estimators': [100, 200],
    'min_child_weight': [1, 3],
    'subsample': [0.8, 0.9]
}

# Set up RandomizedSearchCV
search = RandomizedSearchCV(
    model,
    param_dist,
    cv=3,
    n_iter=4,
    n_jobs=-1,
    random_state=42
)

# This line triggers the error with incompatible versions
search.fit(X, y)
```
Run the file:

```bash
python train_model.py
```

Encountered Error
When running the code, I encountered the following error:
```
AttributeError: 'super' object has no attribute '__sklearn_tags__'
```
This is how I encountered the error in my Weather Prediction System.
This error occurs due to an incompatibility between XGBoost and scikit-learn versions. Specifically, the XGBoost version used did not fully support the newer scikit-learn tags interface.
This error typically arises when using scikit-learn 1.6 or newer in conjunction with an older XGBoost release that does not yet implement the new `__sklearn_tags__` interface. You can fix the version mismatch in one of two ways:
Option A: Use older scikit-learn
pip install "scikit-learn<1.6" pip install xgboost
Option B: Use newer versions of both libraries (from scikit-learn 1.6.1 onward, estimators that still use the old tags interface trigger a warning instead of a hard error)
pip install "scikit-learn>=1.6.1" pip install xgboost Python Alternatively you can print the version’s as well
```python
import sklearn
import xgboost

print(f"scikit-learn version: {sklearn.__version__}")
print(f"XGBoost version: {xgboost.__version__}")
```
Recommended combinations:
scikit-learn < 1.6 with any XGBoost version
scikit-learn >= 1.6.1 with a recent XGBoost release that implements `__sklearn_tags__` (2.1.4 or newer)
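If you prefer to automate that check rather than eyeballing version numbers, a small guard like the following can help. It is only a sketch based on the thresholds discussed above; the 2.1.4 cutoff for XGBoost is my assumption about when the new tags interface landed, so adjust it to whatever combination you standardize on.

```python
# Rough compatibility guard based on the version thresholds discussed above.
# The xgboost 2.1.4 cutoff is an assumption; adjust it to your own pins.
import sklearn
import xgboost
from packaging.version import Version  # packaging is available in most Python environments

sk_ver = Version(sklearn.__version__)
xgb_ver = Version(xgboost.__version__)

if sk_ver >= Version("1.6") and xgb_ver < Version("2.1.4"):
    print("Likely incompatible: upgrade xgboost or pin scikit-learn<1.6")
else:
    print(f"Versions look fine: scikit-learn {sk_ver}, xgboost {xgb_ver}")
```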
Resolving the Error
1. Upgrade or Downgrade Libraries
As discussed earlier, either upgrade XGBoost to a release that supports the new tags interface:

```bash
pip install --upgrade xgboost
```

or downgrade scikit-learn below 1.6:

```bash
pip install "scikit-learn<1.6"
```
2. Use the Latest Development Version
For bleeding-edge fixes, you can install XGBoost directly from its repository:

```bash
pip install git+https://github.com/dmlc/xgboost.git
```

3. Alternative: Manual Hyperparameter Search
Instead of downgrading or upgrading, you can bypass the issue entirely by performing the hyperparameter search yourself, without going through RandomizedSearchCV and its reliance on the `__sklearn_tags__` mechanism.
- Manual hyperparameter search: instead of relying on RandomizedSearchCV, the code manually iterates through all possible combinations of hyperparameters using the product function from itertools.
- Model evaluation: for each hyperparameter combination, the model is trained and evaluated using mean squared error (MSE).
- Best parameters: after evaluating all combinations, the best parameters are stored and printed along with the best score.
```python
from xgboost import XGBRegressor
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from itertools import product

# Step 1: Generate Sample Data
X, y = make_regression(n_samples=100, n_features=10, random_state=42)

# Step 2: Initialize XGBoost Regressor
model = XGBRegressor()

# Step 3: Define Parameter Search Space
param_dist = {
    'max_depth': [3, 4, 5],
    'learning_rate': [0.01, 0.1],
    'n_estimators': [100, 200],
    'min_child_weight': [1, 3],
    'subsample': [0.8, 0.9]
}

# Step 4: Manually Perform Hyperparameter Search
best_score = float('inf')
best_params = None

# Create all combinations of hyperparameters
param_combinations = product(
    param_dist['max_depth'],
    param_dist['learning_rate'],
    param_dist['n_estimators'],
    param_dist['min_child_weight'],
    param_dist['subsample']
)

# Step 5: Loop Through All Combinations
for params in param_combinations:
    model.set_params(
        max_depth=params[0],
        learning_rate=params[1],
        n_estimators=params[2],
        min_child_weight=params[3],
        subsample=params[4]
    )

    # Step 6: Train the Model
    model.fit(X, y)

    # Step 7: Evaluate the Model Using Mean Squared Error
    predictions = model.predict(X)
    score = mean_squared_error(y, predictions)

    # Step 8: Track the Best Hyperparameters and Score
    if score < best_score:
        best_score = score
        best_params = params

# Step 9: Display Best Parameters and Best Score
print("Best Parameters:", best_params)
print("Best Score:", best_score)
```
Conclusion
In my journey with the Weather Prediction System, I faced and resolved this error, learning about compatibility issues between XGBoost and scikit-learn and how to work around them. Whether by upgrading or downgrading the libraries or by running a manual hyperparameter search, this challenge added valuable insights to my development process. I hope this guide helps you address similar challenges!