LIME: Explaining Machine Learning Models with Confidence

Tushar Aggarwal
7 min read

Machine learning models have become increasingly complex and accurate over the years, but their opacity remains a significant challenge. Understanding why a model makes a particular prediction is crucial for building trust and ensuring that it behaves as expected. In this article, we will explore the power of LIME, a popular library that helps explain the inner workings of machine learning classifiers.

Introduction to LIME

LIME (Local Interpretable Model-agnostic Explanations) is a powerful Python library that aids in explaining what machine learning classifiers (or models) are doing. LIME’s primary purpose is to provide interpretable, human-readable explanations for individual predictions made by complex ML models. By offering a detailed understanding of how these models operate, LIME encourages trust in machine learning systems.

As ML models become increasingly complex, it can be challenging to understand their inner workings. LIME addresses this issue by creating local explanations for specific instances, making it easier for users to comprehend and trust ML models.

Why LIME is Important

The importance of LIME in the world of machine learning cannot be overstated. As ML models play an increasingly significant role in critical decision-making processes, it’s essential to be able to trust their outputs. LIME enables users to:

  1. Understand the predictions of complex ML models by creating simple, interpretable explanations.

  2. Identify potential biases and errors in the model by inspecting individual predictions.

  3. Improve model performance by understanding the features that contribute to accurate predictions.

  4. Boost user trust in the ML system by providing transparency and interpretability.

Understanding LIME’s Workflow

LIME operates by approximating the complex ML model with a simpler, locally interpretable model in the neighborhood of a specific instance. The main steps of LIME’s workflow, sketched in code right after the list, are:

  1. Select an instance to be explained.

  2. Perturb the instance by generating a set of neighboring samples.

  3. Obtain predictions for the perturbed samples using the complex ML model.

  4. Fit a simpler, interpretable model (e.g., linear regression or decision tree) to the perturbed samples and their predictions.

  5. Interpret the simpler model to provide an explanation for the original instance.
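To make these steps concrete, here is a minimal, self-contained sketch of the idea behind LIME on tabular data. It is not the library’s internal implementation: the sampling scheme, the proximity kernel, and the surrogate model (a ridge regression here) are simplified stand-ins for what LIME actually does.

# Conceptual sketch of LIME's workflow (not the library's implementation)
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# A "complex" model standing in for any black box
data = load_iris()
black_box = RandomForestClassifier().fit(data.data, data.target)

# Step 1: select an instance to be explained
instance = data.data[0]

# Step 2: perturb the instance by sampling neighbors around it
rng = np.random.default_rng(42)
perturbed = instance + rng.normal(scale=data.data.std(axis=0), size=(1000, 4))

# Step 3: obtain the black-box predictions for the perturbed samples
probs = black_box.predict_proba(perturbed)[:, 0]  # probability of class 0

# Step 4: fit a simple interpretable model, weighting samples by proximity
distances = np.linalg.norm(perturbed - instance, axis=1)
weights = np.exp(-(distances ** 2) / (2 * distances.std() ** 2))
surrogate = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)

# Step 5: interpret the surrogate's coefficients as the local explanation
for name, coef in zip(data.feature_names, surrogate.coef_):
    print(f"{name}: {coef:+.3f}")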

Installing LIME

Before you can start using LIME, you’ll need to install it. You can install LIME using pip:

pip install lime
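A quick way to confirm the installation worked is to import the package and look up the installed version (this assumes the package was installed from PyPI under the name lime):

# Sanity check: the import only succeeds if the installation worked
import importlib.metadata
import lime

print("lime", importlib.metadata.version("lime"))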

Using LIME with Different Machine Learning Models

LIME is model-agnostic, meaning it works with a wide range of ML models: tabular classification and regression models as well as text and image classifiers. In this section, we will cover how to use LIME with each of these.

Classification Models

To use LIME with a classification model, you need to create an explainer object and then generate explanations for specific instances. Here’s a simple example using the LIME library with a classification model:

# Classification - Lime
import lime
import lime.lime_tabular
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Load the dataset and train a classifier
data = datasets.load_iris()
classifier = RandomForestClassifier()
classifier.fit(data.data, data.target)

# Create a LIME explainer object
explainer = lime.lime_tabular.LimeTabularExplainer(
    data.data,
    mode="classification",
    training_labels=data.target,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,
)

# Select an instance to be explained (you can choose any index)
instance = data.data[0]

# Generate an explanation for the instance
explanation = explainer.explain_instance(instance, classifier.predict_proba, num_features=5)

# Display the explanation
explanation.show_in_notebook()

Output

Regression Models

Using LIME with a regression model is similar to using it with a classification model. You’ll need to create an explainer object and then generate explanations for specific instances. Here’s an example using the LIME library with a regression model:

# Regression - Lime
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from lime.lime_tabular import LimeTabularExplainer

# Generate a custom regression dataset
np.random.seed(42)
X = np.random.rand(100, 5)  # 100 samples, 5 features
y = 2 * X[:, 0] + 3 * X[:, 1] + 1 * X[:, 2] + np.random.randn(100)  # Linear combination of the first three features plus Gaussian noise

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a simple linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Initialize a LimeTabularExplainer
explainer = LimeTabularExplainer(training_data=X_train, mode="regression")

# Select a sample instance for explanation
sample_instance = X_test[0]

# Explain the prediction for the sample instance
explanation = explainer.explain_instance(sample_instance, model.predict)

# Print the explanation
explanation.show_in_notebook()

Output

Text Models

LIME can also be used to explain predictions made by text models. To use LIME with a text model, you’ll need to create a LIME text explainer object and then generate explanations for specific instances. Here’s an example using the LIME library with a text model:

# Text Model - Lime
import lime
import lime.lime_text
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.datasets import fetch_20newsgroups

# Load a sample dataset (20 Newsgroups) for text classification
categories = ['alt.atheism', 'soc.religion.christian']
newsgroups_train = fetch_20newsgroups(subset='train', categories=categories)

# Create a simple text classification model (Multinomial Naive Bayes)
tfidf_vectorizer = TfidfVectorizer()
X_train = tfidf_vectorizer.fit_transform(newsgroups_train.data)
y_train = newsgroups_train.target
classifier = MultinomialNB()
classifier.fit(X_train, y_train)

# Define a custom Lime explainer for text data
explainer = lime.lime_text.LimeTextExplainer(class_names=newsgroups_train.target_names)

# Choose a text instance to explain
text_instance = newsgroups_train.data[0]

# Create a predict function for the classifier
predict_fn = lambda x: classifier.predict_proba(tfidf_vectorizer.transform(x))

# Explain the model's prediction for the chosen text instance
explanation = explainer.explain_instance(text_instance, predict_fn)

# Print the explanation
explanation.show_in_notebook()

Output

Image Models

LIME can explain predictions made by image models as well. To use LIME with an image model, you’ll need to create a LIME image explainer object and then generate explanations for specific instances. Here’s an example using the LIME library with an image model:

# Image Model - Lime
import numpy as np
import lime
import lime.lime_image
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Load the dataset and train an image classifier on flattened pixel values
data = datasets.load_digits()
classifier = RandomForestClassifier()
classifier.fit(data.images.reshape((len(data.images), -1)), data.target)

# LIME passes batches of perturbed RGB images to the predict function, so convert
# each one back to grayscale and flatten it before calling the classifier
def predict_fn(images):
    grayscale = np.array(images).mean(axis=3)
    return classifier.predict_proba(grayscale.reshape(len(images), -1))

# Create a LIME image explainer object
explainer = lime.lime_image.LimeImageExplainer()

# Select an instance to be explained (a 2D grayscale image is converted to RGB internally)
instance = data.images[0]

# Generate an explanation for the instance
explanation = explainer.explain_instance(instance, predict_fn, top_labels=5)
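To actually see which regions of the image drove the prediction, the explanation object can be converted into an image and a mask. The snippet below is a minimal continuation of the example above, using get_image_and_mask together with scikit-image’s mark_boundaries and matplotlib (both typically come with a LIME installation):

# Visualize the regions that contributed to the top predicted label
import matplotlib.pyplot as plt
from skimage.segmentation import mark_boundaries

top_label = explanation.top_labels[0]
image, mask = explanation.get_image_and_mask(top_label, positive_only=True, num_features=5, hide_rest=False)
plt.imshow(mark_boundaries(image / image.max(), mask))
plt.title(f"LIME explanation for digit class {top_label}")
plt.axis("off")
plt.show()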

Interpreting LIME’s Output

After generating an explanation using LIME, you can visualize it to understand how much each feature contributed to the prediction. For tabular data, the show_in_notebook and as_pyplot_figure methods display the explanation, and as_list returns the raw feature weights. Text explanations can also be rendered with show_in_notebook, while image explanations expose get_image_and_mask for highlighting the influential regions, as shown in the image example above.
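For example, continuing with the tabular classification explanation generated earlier, the same information can be pulled out programmatically instead of (or alongside) the notebook view:

# Feature contributions as (condition, weight) pairs for the explained class
print(explanation.as_list())

# The same explanation rendered as a matplotlib bar chart
fig = explanation.as_pyplot_figure()
fig.tight_layout()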

By understanding the contributions of individual features, you can gain insights into the model’s decision-making process and identify potential biases or issues.

Advanced Techniques in LIME

LIME offers several advanced techniques for improving the quality of explanations and tailoring them to your specific use case. Some of these techniques, illustrated in the short sketch after the list, include:

  1. Tuning the number of perturbed samples: Increasing the number of perturbed samples can improve the stability and accuracy of the explanations.

  2. Selecting the interpretable model: Choosing an appropriate interpretable model (e.g., linear regression, decision tree) can impact the quality of explanations.

  3. Feature selection: Customizing the number of features used in the explanation can help focus on the most important contributors to the prediction.
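These knobs map directly onto arguments in the library’s API. Building on the tabular classification example from earlier, a sketch of how they can be set looks like this:

# Explainer-level option: how LIME chooses which features to report
explainer = lime.lime_tabular.LimeTabularExplainer(
    data.data,
    mode="classification",
    feature_names=data.feature_names,
    class_names=data.target_names,
    feature_selection="lasso_path",  # alternatives: 'auto', 'forward_selection', 'highest_weights', 'none'
)

# Per-explanation options: more perturbed samples for stability,
# fewer reported features for a more focused explanation
explanation = explainer.explain_instance(
    data.data[0],
    classifier.predict_proba,
    num_features=3,        # report only the top 3 contributors
    num_samples=10000,     # default is 5000; more samples give more stable explanations
    model_regressor=None,  # interpretable surrogate; defaults to ridge regression, any sklearn regressor works
)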

Limitations and Alternatives to LIME

While LIME is a powerful tool for explaining machine learning models, it has some limitations:

  1. Local explanations: LIME focuses on local explanations, which may not capture the overall behavior of the model.

  2. Computationally expensive: Generating explanations using LIME can be time-consuming, especially for large datasets and complex models.

If LIME does not meet your needs, there are alternative methods for explaining machine learning models, such as SHAP (SHapley Additive exPlanations) and Anchors.

Building Trust in ML Models with LIME

By providing interpretable explanations for individual predictions, LIME can help build trust in machine learning models. This trust is critical in many industries, especially when ML models are used to make important decisions. With a better understanding of how their models work, users can confidently rely on ML systems and make data-driven decisions.

Conclusion

LIME is an invaluable tool for explaining what machine learning classifiers (or models) are doing. By offering a practical approach to understanding complex ML models, LIME enables users to trust and improve their systems. By following this practical guide, you can harness the power of LIME to explain predictions, identify potential biases and errors, and ultimately build better machine learning models.

Don’t keep this valuable resource to yourself — feel free to reshare it with your network. Let’s empower more professionals with the knowledge they need to excel in the world of data!

Newsletter DataUnboxed

Follow/Connect on Github & LinkedIn.
