SAE Training Duality

Creator: Seonglae Cho
Created: 2025 Mar 10 16:5
Edited: 2025 Mar 10 16:7

A sparse autoencoder (SAE) aims to solve a bilevel optimization problem: the outer optimization minimizes reconstruction error plus a sparsity regularizer, while the inner optimization finds the optimal projection values for the encoder under a given constraint set.
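
The bilevel structure can be written out as a rough sketch. The symbols here are assumptions introduced for illustration and do not come from the note itself: $W_E, b_E$ are the encoder weights and bias, $W_D$ is the decoder, $S$ is the constraint set, $R$ is the sparsity regularizer, and $\lambda$ is its weight.

```latex
\begin{aligned}
\min_{W_E,\, b_E,\, W_D} \quad & \sum_{x} \lVert x - W_D\, z^*(x) \rVert_2^2 + \lambda\, R\big(z^*(x)\big) \\
\text{s.t.} \quad & z^*(x) = \arg\min_{z \in S} \lVert z - (W_E x + b_E) \rVert_2^2
\end{aligned}
```

Under this reading, the inner problem is a Euclidean projection of the encoder pre-activation onto $S$. For example, taking $S$ to be the nonnegative orthant gives $z^*(x) = \mathrm{ReLU}(W_E x + b_E)$.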

Duality

There is a fundamental duality between how concepts are organized in model representations and how an SAE encoder's receptive fields should be structured to identify those concepts optimally. Crucially, this implies that any SAE is implicitly biased toward identifying concepts that are organized in a specific manner.
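
To make the role of the constraint set concrete, here is a minimal sketch assuming the encoder nonlinearity acts as a Euclidean projection onto a constraint set. The function names and the example vector are hypothetical, and NumPy is used only for illustration. Two different constraint sets turn the same pre-activations into different codes, which is one way to see how the choice of SAE shapes which inputs a latent responds to.

```python
import numpy as np

def project_nonnegative(v):
    """Euclidean projection onto {z : z >= 0}; this coincides with ReLU."""
    return np.maximum(v, 0.0)

def project_topk_nonnegative(v, k):
    """Euclidean projection onto {z : z >= 0, at most k nonzeros} (TopK-style)."""
    z = np.maximum(v, 0.0)  # new array, safe to modify in place
    if k < z.size:
        # Zero out everything except the k largest entries.
        drop = np.argsort(z)[: z.size - k]
        z[drop] = 0.0
    return z

# Hypothetical encoder pre-activations, standing in for W_E x + b_E.
pre_acts = np.array([0.9, -0.3, 0.2, 1.5])

relu_code = project_nonnegative(pre_acts)            # keeps every positive entry
topk_code = project_topk_nonnegative(pre_acts, k=2)  # keeps only the two largest
```

With the same pre-activations, the ReLU-style projection keeps the weak third latent active while the TopK-style projection suppresses it, so the two constraint sets impose different assumptions about how many concepts are present at once.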
 
 
 
 
The Rate Distortion Dance between reconstruction (Compressed sensing) and sparsity (Interpretable Sparse Coding)
