Local Interpretable Model-agnostic Explanations (LIME)
LIME explains a single prediction by approximating the complex model, in the neighborhood of that prediction, with an interpretable surrogate model.
- It generates a new dataset of perturbed samples around the instance of interest, labeled with the black-box model's predictions for those samples
- On this new dataset, LIME trains an interpretable model, with each sample weighted by its proximity to the instance of interest
LIME can explain any black-box classifier with two or more classes, provided the classifier exposes a function that takes raw text or a NumPy array and returns a probability for each class.
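
The perturb, predict, weight, and fit procedure described above can be sketched in a few lines of NumPy. This is a toy illustration for text input, not the `lime` library's implementation: the function name `lime_text_sketch`, the word-masking scheme, the kernel width, the ridge penalty, and the `toy_predict_proba` black box are all assumptions chosen for the example.

```python
import numpy as np


def lime_text_sketch(text, predict_proba, label, num_samples=500,
                     kernel_width=0.25, ridge=1e-3, seed=0):
    """Toy LIME-style explanation for one text (hypothetical helper, not the lime API)."""
    rng = np.random.default_rng(seed)
    words = text.split()
    d = len(words)

    # 1. Perturb: each row is a binary mask saying which words are kept.
    masks = rng.integers(0, 2, size=(num_samples, d))
    masks[0] = 1  # keep the original, unperturbed instance as the first sample
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]

    # 2. Label the perturbed samples with the black-box model's predictions.
    y = np.asarray(predict_proba(perturbed), dtype=float)[:, label]

    # 3. Proximity weights: exponential kernel on the cosine distance between
    #    each mask and the all-ones mask (the instance of interest).
    distance = 1.0 - np.sqrt(masks.sum(axis=1) / d)
    weights = np.exp(-(distance ** 2) / kernel_width ** 2)

    # 4. Fit a weighted ridge regression (intercept left unpenalized).
    X = np.hstack([np.ones((num_samples, 1)), masks])
    XtW = X.T * weights
    penalty = ridge * np.eye(d + 1)
    penalty[0, 0] = 0.0
    beta = np.linalg.solve(XtW @ X + penalty, XtW @ y)

    # The coefficients of the surrogate model are the explanation.
    pairs = ((w, float(c)) for w, c in zip(words, beta[1:]))
    return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)


def toy_predict_proba(texts):
    """Stand-in black box: class 1 is likely only if the word 'terrible' is present."""
    p_neg = np.array([0.9 if "terrible" in t.split() else 0.1 for t in texts])
    return np.column_stack([1.0 - p_neg, p_neg])


explanation = lime_text_sketch("the movie was terrible", toy_predict_proba, label=1)
for word, weight in explanation:
    print(f"{word:>10s}  {weight:+.3f}")
```

Because the toy black box depends on a single word, the surrogate should assign nearly all of the weight to that word; on a real classifier the weights are only a local approximation and vary with the sampling seed and kernel width.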

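```python
import numpy as np
from lime.lime_text import LimeTextExplainer

# `classifier` and `counterfactuals` are defined elsewhere. `classifier` is assumed
# to return, for each text, one {"label", "score"} dict per class, in the same
# order as `class_names` below.

def predict_proba(texts):
    preds = classifier(texts)
    probabilities = np.array([[pred['score'] for pred in preds_single] for preds_single in preds])
    return probabilities  # shape: (n_texts, n_classes)

explainer = LimeTextExplainer(class_names=["stereotype", "neutral", "unrelated"])

lime_values_per_sentence = []
for idx, sentence in enumerate(counterfactuals):
    # Request class 0 ("stereotype") explicitly. With top_labels=1, as_list(label=0)
    # raises a KeyError whenever another class is the top prediction.
    exp = explainer.explain_instance(sentence, predict_proba, num_features=50, num_samples=100, labels=(0,))
    feature_importances = exp.as_list(label=0)  # list of (word, weight) pairs
    lime_values = [weight for _, weight in feature_importances]
    lime_values_per_sentence.append(lime_values)
    print(f"LIME values for Sentence {idx+1} 'stereotype':", lime_values)
    exp.show_in_notebook()
```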
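
The `predict_proba` wrapper above relies on the classifier returning scores in the same order as `class_names`. Depending on how the classifier is configured, the per-class dicts may instead come back sorted by score, so a more defensive sketch looks the scores up by label name. The helper name `scores_to_matrix`, the `"label"`/`"score"` dict layout, and the mock predictions below are assumptions for illustration; the label strings must match whatever the actual model emits.

```python
import numpy as np

CLASS_NAMES = ["stereotype", "neutral", "unrelated"]


def scores_to_matrix(preds, class_names=CLASS_NAMES):
    """Convert per-text lists of {"label", "score"} dicts into an
    (n_texts, n_classes) array whose columns follow `class_names`."""
    rows = []
    for preds_single in preds:
        by_label = {p["label"]: p["score"] for p in preds_single}
        rows.append([by_label[name] for name in class_names])
    return np.array(rows)


# Mock classifier output, deliberately not in CLASS_NAMES order.
mock_preds = [
    [{"label": "neutral", "score": 0.7},
     {"label": "stereotype", "score": 0.2},
     {"label": "unrelated", "score": 0.1}],
    [{"label": "stereotype", "score": 0.6},
     {"label": "unrelated", "score": 0.3},
     {"label": "neutral", "score": 0.1}],
]

probabilities = scores_to_matrix(mock_preds)
print(probabilities)
```

With this layout, column 0 always holds the "stereotype" probability, so `exp.as_list(label=0)` refers to the intended class regardless of the order in which the classifier reports its scores.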

GitHub - marcotcr/lime: Lime: Explaining the predictions of any machine learning classifier
https://github.com/marcotcr/lime
Ribeiro, Singh, Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier (KDD 2016)
https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf

Seonglae Cho