NeuronEval

Creator: Seonglae Cho
Created: 2025 Jul 1 14:54
Edited: 2025 Jul 1 17:30
All 18 metrics in NeuronEval evaluate the "faithfulness" of explanations by measuring how well the activations predicted from an explanation match the actual activation pattern of the unit (neuron or SAE Feature). The paper introduces two sanity checks, the "missing label test" and the "excessive label test", and applies them to all 18 evaluation metrics. Only five of the 18 stand-alone metrics passed both checks: F1-score, IoU, Pearson correlation, cosine similarity, and AUPRC; these are recommended as the "reliable" metrics.
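A minimal sketch of how the five reliable metrics can be computed, assuming the unit's activations are binarized to serve as ground truth and the explanation yields a per-input predicted score (the synthetic data and variable names here are illustrative, not from the paper):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import f1_score, average_precision_score

rng = np.random.default_rng(0)
activations = rng.random(1000)                 # unit's actual activations per input
act_binary = (activations > 0.5).astype(int)   # thresholded "unit fires" labels
pred_scores = rng.random(1000)                 # activations the explanation predicts
pred_labels = (pred_scores > 0.5).astype(int)  # thresholded predicted labels

# F1-score: harmonic mean of precision and recall on the binarized labels
f1 = f1_score(act_binary, pred_labels)

# IoU: size of intersection over size of union of the firing set and the predicted set
iou = np.logical_and(act_binary, pred_labels).sum() / np.logical_or(act_binary, pred_labels).sum()

# Pearson correlation between raw activations and predicted scores
r, _ = pearsonr(activations, pred_scores)

# Cosine similarity between the two activation vectors
cos = activations @ pred_scores / (np.linalg.norm(activations) * np.linalg.norm(pred_scores))

# AUPRC: ranks inputs by predicted score against the binarized activations
auprc = average_precision_score(act_binary, pred_scores)

print(f"F1={f1:.3f}  IoU={iou:.3f}  Pearson={r:.3f}  Cosine={cos:.3f}  AUPRC={auprc:.3f}")
```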
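The two sanity checks can be pictured as perturbations of a well-matched explanation: the missing label test removes labels the unit actually fires on, and the excessive label test adds labels where it does not; a reliable metric's score should drop in both cases. A toy sketch using IoU (my reading of the checks, not the paper's exact construction):

```python
import numpy as np

def iou_metric(labels: np.ndarray, act_binary: np.ndarray) -> float:
    """IoU between an explanation's binary labels and the unit's firing pattern."""
    union = np.logical_or(labels, act_binary).sum()
    return np.logical_and(labels, act_binary).sum() / max(union, 1)

rng = np.random.default_rng(1)
act_binary = rng.integers(0, 2, 1000)  # synthetic firing pattern
perfect = act_binary.copy()            # explanation that matches it exactly

# Missing label test: remove half of the true labels; the score should decrease.
missing = perfect.copy()
on = np.flatnonzero(missing)
missing[rng.choice(on, size=len(on) // 2, replace=False)] = 0

# Excessive label test: add spurious labels on half of the negatives;
# the score should also decrease.
excessive = perfect.copy()
off = np.flatnonzero(excessive == 0)
excessive[rng.choice(off, size=len(off) // 2, replace=False)] = 1

base = iou_metric(perfect, act_binary)
assert iou_metric(missing, act_binary) < base    # IoU passes the missing label test
assert iou_metric(excessive, act_binary) < base  # IoU passes the excessive label test
```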

Datasets used to compare which of the 18 metrics work well

  1. Vision
      • ImageNet (1,000 classes)
      • Places365 (365 place categories)
      • CUB-200-2011 (200 bird species; 112 detailed attribute labels)
  2. Language
      • OpenWebText (for GPT-2 model evaluation, limited to 500 frequently occurring tokens; see the sketch after this list)
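For the language setting, one plausible reading of the token restriction is that only the most frequent tokens in the corpus are scored; a toy sketch (the stand-in corpus and variable names are illustrative, not the paper's pipeline):

```python
from collections import Counter

# Toy stand-in; in practice these would be tokenized OpenWebText documents.
corpus_tokens = "the cat sat on the mat and the cat ran".split()
counts = Counter(corpus_tokens)
frequent = {tok for tok, _ in counts.most_common(500)}  # keep the top-500 tokens

# Evaluation would then only consider activations on tokens in `frequent`.
```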

Recommendations