SAE Feature Shrinkage

Creator
Seonglae Cho
Created
2024 Oct 24 9:21
Edited
2025 Mar 6 11:58
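SAEs systematically underestimate the intensity (activation strength) of the features they detect, an effect known as feature shrinkage. The cause is the sparsity penalty used during training: because the loss charges for every unit of activation, the SAE can lower its total loss by reporting a slightly smaller activation than the one that would reconstruct the input best. Interference adds to this: when other features overlap with a given feature, the SAE underestimates that feature's intensity further to account for them.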
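The effect of the sparsity penalty can be illustrated with a toy calculation. The sketch below is only an assumption-laden illustration, not the training setup of any particular SAE: it assumes a single feature with a unit-norm decoder direction, a squared reconstruction error, and an L1 penalty with a hypothetical coefficient `l1_coeff`. Under those assumptions the loss is `(x - a)^2 + l1_coeff * |a|`, and its minimizer is `max(x - l1_coeff / 2, 0)`, which is always below the true intensity `x`.

```python
import numpy as np


def toy_loss(activation, true_intensity, l1_coeff):
    """Squared reconstruction error plus an L1 sparsity penalty (toy, single feature)."""
    return (true_intensity - activation) ** 2 + l1_coeff * np.abs(activation)


def l1_optimal_activation(true_intensity, l1_coeff):
    """Closed-form minimizer of toy_loss over non-negative activations."""
    return max(true_intensity - l1_coeff / 2.0, 0.0)


# Hypothetical values chosen for illustration only.
true_intensity = 2.0
l1_coeff = 0.5

# Brute-force search over candidate activations to confirm the closed form.
grid = np.linspace(0.0, 3.0, 3001)
grid_best = grid[np.argmin(toy_loss(grid, true_intensity, l1_coeff))]
closed_form = l1_optimal_activation(true_intensity, l1_coeff)

print(f"true intensity:      {true_intensity}")
print(f"loss-optimal (grid): {grid_best:.3f}")
print(f"loss-optimal (form): {closed_form:.3f}")
```

In this toy setting the gap between the true intensity and the loss-optimal activation is `l1_coeff / 2` and does not depend on the input, so a larger sparsity coefficient means more shrinkage. Real SAEs have many interacting features, so the actual amount of shrinkage will differ.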
 
 
 

shrink

Pathological error
 
 

Recommendations