Introducing PrunedTree: Smarter Decision Trees with Automatic Depth Pruning

Motivation:
Decision trees are a popular choice in machine learning because they are interpretable and easy to use. However, they often suffer from a major issue: overfitting. A tree that is allowed to grow without limit keeps splitting until it memorizes the training data.
The usual remedy is to cap the tree's depth, and most practitioners tune the max_depth hyperparameter manually, either by trial and error or with a grid search.
That’s where I thought, “Why not automate this step?”
So, I built and published prunetree, a Python package that provides:
A drop-in DecisionTreeClassifier with built-in automatic pruning based on validation accuracy.
What Is prunetree?
PrunedDecisionTreeClassifier is a scikit-learn compatible estimator that:
Iteratively trains trees from depth 1 up to a max limit
Evaluates each on validation data
Selects the depth with the highest validation accuracy
Fits the final model using that depth
All you have to do is:
pip install prunetree
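To make the four steps listed above concrete, here is a minimal sketch of the same idea using plain scikit-learn. It is an illustration only, not prunetree's actual source: the function name select_best_depth, the max_depth_limit parameter, and the tie-breaking rule (prefer the shallowest tree) are all assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def select_best_depth(X_train, y_train, X_val, y_val,
                      max_depth_limit=10, random_state=None):
    """Hypothetical helper: pick max_depth by validation accuracy."""
    best_depth, best_score = None, -1.0
    for depth in range(1, max_depth_limit + 1):
        tree = DecisionTreeClassifier(max_depth=depth, random_state=random_state)
        tree.fit(X_train, y_train)
        score = tree.score(X_val, y_val)  # accuracy on the validation data
        # Strict '>' keeps the shallowest depth when scores tie (an assumption).
        if score > best_score:
            best_depth, best_score = depth, score
    # Fit the final model at the selected depth.
    final_tree = DecisionTreeClassifier(max_depth=best_depth, random_state=random_state)
    final_tree.fit(X_train, y_train)
    return best_depth, final_tree


X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
best_depth, model = select_best_depth(X_train, y_train, X_val, y_val, random_state=42)
print("Best depth selected:", best_depth)
print("Validation accuracy:", model.score(X_val, y_val))
```

The loop trains one tree per candidate depth, so the cost grows linearly with the depth limit.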
How to Use It:
Here’s how you can use it with your training and testing split:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from prunetree import PrunedDecisionTreeClassifier

# Load the Iris dataset as a feature matrix X and labels y
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The held-out split doubles as the validation set used to pick the depth
clf = PrunedDecisionTreeClassifier(
    prune=True,
    validation_data=(X_test, y_test),
    random_state=42,
)
clf.fit(X_train, y_train)

print("Best depth selected:", clf.best_depth)
print("Test accuracy:", clf.score(X_test, y_test))
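One caveat: in the example above, X_test is used both to choose the depth and to report the final score, so the printed accuracy tends to be optimistic. A possible way around this, sketched here with scikit-learn only, is to split the data three ways and keep the test set out of depth selection. The variable names X_val and X_rest are just illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First carve off a test set that is never used for depth selection.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Then split the remainder into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42, stratify=y_rest
)
print(len(X_train), len(X_val), len(X_test))  # 90 30 30
```

With this split you would pass validation_data=(X_val, y_val) to PrunedDecisionTreeClassifier and report clf.score(X_test, y_test) as the final number.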
[Figure: tree nodes and accuracy scores during pruning]
When to Use It
Use PrunedDecisionTreeClassifier when:
You want a clean, fast Decision Tree without manually tuning max_depth
You have a fixed validation/test split
You prefer simplicity over full-blown grid search
Think of it as a decision tree that tunes itself (at least for depth!)
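For comparison, this is roughly what the grid-search route looks like with scikit-learn alone. It is a sketch under assumed settings (depths 1 to 10, 5-fold cross-validation), not a benchmark against prunetree.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Search max_depth from 1 to 10 using 5-fold cross-validation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": list(range(1, 11))},
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print("Best depth:", search.best_params_["max_depth"])
print("Mean CV accuracy:", search.best_score_)
```

With these settings the search fits 50 trees plus a final refit, compared with one tree per depth for a single fixed validation split. In exchange, the chosen depth depends less on how one particular split happened to fall.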
Why I Open-Sourced It
I built this as a personal project to learn Python packaging and real-world ML tooling. But then I realized others might find it useful too.
So, I made it open source:
What’s Next?
I’m planning to:
Add cross-validation based pruning
Add support for regression trees
Integrate into pipelines and grid search
Final Words:
If you love clean machine learning workflows, or just hate tuning max_depth manually, give prunetree a try.
Happy learning and keep building.
– Arun Sundar K
