Feature Monosemanticity Score

Creator: Seonglae Cho
Created: 2025 Dec 22 17:39
Editor: Seonglae Cho
Edited: 2025 Dec 22 17:41

FMS (Feature Monosemanticity Score)

Computing the score requires token-wise labeling.
https://arxiv.org/pdf/2506.19382v1
 

Copyright Seonglae Cho