Embedding-based Scoring
Compares the text embeddings of a feature's top-activating and least-activating samples with the embedding of its explanation text, using cosine similarity.
It serves as a useful metric for evaluating how well an explanation describes a feature, and for addressing the linear mismatch between low and high activations.
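The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real text-embedding model, and the `embedding_score` helper and its `gap` summary (mean similarity to top samples minus mean similarity to least samples) are assumptions introduced here for clarity.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def mean(values: list) -> float:
    """Arithmetic mean that returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def embedding_score(explanation: str, top_samples: list, least_samples: list) -> dict:
    """Compare the explanation embedding with top and least activating samples.

    The 'gap' field is an assumed summary statistic for this sketch only.
    """
    e = toy_embed(explanation)
    top = mean([cosine_similarity(e, toy_embed(s)) for s in top_samples])
    least = mean([cosine_similarity(e, toy_embed(s)) for s in least_samples])
    return {"top": top, "least": least, "gap": top - least}


if __name__ == "__main__":
    result = embedding_score(
        "references to dogs and puppies",
        top_samples=["the dogs chased the ball", "puppies and dogs playing"],
        least_samples=["stock prices fell sharply", "the meeting starts at noon"],
    )
    print(result)
```

In practice the toy embedding would be replaced by a real sentence-embedding model. A larger gap would suggest that the explanation sits closer to highly activating text than to weakly activating text.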
https://arxiv.org/pdf/2410.13928

Seonglae Cho