SAE High Frequency Latent

Creator
Seonglae Cho
Created
2025 Mar 20 14:30
Edited
2025 Jul 1 15:08

High-frequency latents are hard to interpret.

Quadratic frequency loss

High-frequency latents can be removed by adding a quadratic frequency loss computed over each batch.
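The note does not define the loss, so the following is only a minimal sketch of one plausible reading: measure how often each latent fires within a batch and penalize the squared excess over a frequency threshold. The function names, the `threshold` parameter, and the hard `> 0` firing test are all assumptions made for illustration. A real training loss would need a differentiable surrogate for the firing indicator.

```python
import numpy as np

def batch_activation_frequency(latents):
    """Fraction of batch examples on which each latent is active (> 0)."""
    return (latents > 0).mean(axis=0)

def quadratic_frequency_penalty(latents, threshold=0.0):
    """Hypothetical penalty: sum of squared firing frequency above `threshold`."""
    freq = batch_activation_frequency(latents)
    excess = np.clip(freq - threshold, 0.0, None)  # only frequent latents pay
    return float(np.sum(excess ** 2))

# Toy batch: 4 examples, 3 latents. Latent 0 fires on every example.
latents = np.array([
    [1.0, 0.0, 0.0],
    [0.5, 0.0, 2.0],
    [2.0, 0.0, 0.0],
    [0.1, 0.3, 0.0],
])
freq = batch_activation_frequency(latents)
penalty = quadratic_frequency_penalty(latents, threshold=0.5)
```

In this toy batch only latent 0 exceeds the threshold, so it alone contributes to the penalty. Because the term is quadratic, a latent that fires on nearly every example is penalized far more than one that is only slightly too frequent.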

Bias Adaptation

Group Bias Adaptation (GBA) adaptively adjusts neuron-specific biases so that each neuron's activation frequency matches a Target Activation Frequency (TAF), thereby capturing raw frequency characteristics. For single-group BA, there is a mathematical proof that when the data satisfies the "sparse, non-cohesive, anti-coherence" conditions and the requirements on neuron count, bias range, and learning rate are met, all monosemantic features can be fully recovered in O(1) iterations. Empirically, GBA achieves better reconstruction loss, activation sparsity, and feature consistency than TopK and ℓ1 SAEs on LLMs with up to 1.5B parameters.
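The bias-matching idea can be illustrated with a toy loop. This is a sketch under stated assumptions, not the published GBA algorithm: the sign-based update rule, the step size, the fixed random pre-activations, and the two hypothetical neuron groups with target frequencies 0.5 and 0.1 are all chosen only for illustration.

```python
import numpy as np

def adapt_biases(pre_acts, biases, target_freq, step=0.02):
    """One hypothetical bias-adaptation step.

    pre_acts:    (batch, n_neurons) pre-activations before the bias is added
    biases:      (n_neurons,) current per-neuron biases
    target_freq: (n_neurons,) target activation frequency per neuron
    """
    freq = ((pre_acts + biases) > 0).mean(axis=0)
    # Lower the bias of neurons that fire too often, raise it for those
    # that fire too rarely (assumed update rule, not taken from the source).
    new_biases = biases - step * np.sign(freq - target_freq)
    return new_biases, freq

rng = np.random.default_rng(0)
pre_acts = rng.normal(size=(4096, 4))         # stand-in for encoder outputs
target_freq = np.array([0.5, 0.5, 0.1, 0.1])  # two hypothetical groups
biases = np.zeros(4)

for _ in range(200):
    biases, freq = adapt_biases(pre_acts, biases, target_freq)
```

After the loop, the neurons in the low-frequency group end up with clearly more negative biases than those in the high-frequency group, and each neuron's measured frequency sits close to its target.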
Alternatively, high-frequency latents may simply represent fundamentally dense signals in the model's activations (dark matter).
 
 
