Mean max cosine similarity (MMCS)
[Interim research report] Taking features out of superposition with sparse autoencoders — AI Alignment Forum
https://www.alignmentforum.org/posts/z6QQJbtpkEAX3Aojj/interim-research-report-taking-features-out-of-superposition
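A minimal sketch of mean max cosine similarity as described in the linked post: for each feature in one dictionary, take the highest cosine similarity to any feature in a second dictionary, then average those maxima. The function names (`cosine`, `mmcs`) and the toy vectors are hypothetical and chosen only for illustration. The metric is not symmetric, so the order of the two dictionaries matters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mmcs(learned, reference):
    """Average, over learned features, of the best cosine match in reference."""
    best = [max(cosine(f, g) for g in reference) for f in learned]
    return sum(best) / len(best)


# Hypothetical toy dictionaries; each inner list is one feature direction.
reference = [[1.0, 0.0], [0.0, 1.0]]
learned = [[2.0, 0.0], [1.0, 1.0]]

# The first learned feature matches a reference direction exactly (cosine 1);
# the second sits at 45 degrees to both (cosine 1/sqrt(2)).
score = mmcs(learned, reference)
print(score)
```

A score of 1 would mean every learned feature points in exactly the same direction as some reference feature, regardless of its norm.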
ML2R (mean L2 ratio)
https://arxiv.org/pdf/2501.14926
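A rough sketch of a mean L2 ratio, under an assumption not stated in this note: each reference vector is paired with the learned vector that has the highest cosine similarity to it, and the metric is the mean of (norm of the matched learned vector) / (norm of the reference vector). The exact definition should be checked against the linked paper. The names `l2`, `cosine`, and `ml2r` and the toy vectors are hypothetical.

```python
import math


def l2(v):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(a * a for a in v))


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    return sum(a * b for a, b in zip(u, v)) / (l2(u) * l2(v))


def ml2r(learned, reference):
    """Assumed definition: mean norm ratio between each reference vector
    and its best cosine match among the learned vectors."""
    ratios = []
    for g in reference:
        match = max(learned, key=lambda f: cosine(f, g))
        ratios.append(l2(match) / l2(g))
    return sum(ratios) / len(ratios)


# Hypothetical toy data: the first learned vector has the right direction
# but half the norm; the second matches its reference exactly.
reference = [[1.0, 0.0], [0.0, 2.0]]
learned = [[0.5, 0.0], [0.0, 2.0]]

ratio = ml2r(learned, reference)
print(ratio)
```

Under this assumed definition, a value near 1 indicates that matched vectors have similar magnitudes, which complements cosine similarity because cosine similarity ignores scale.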

Seonglae Cho