SAE Feature Structure

Creator
Seonglae Cho
Created
2025 Jan 8 20:25
Edited
2025 Mar 6 0:8
Sparse autoencoder (SAE) latent vectors are not independent; they form clusters that activate together in predictable ways. Although each feature is functionally separate, real dependencies exist between them, so their interactions and compositional characteristics matter for interpretability. This is particularly evident in smaller SAEs, and these clusters can be analyzed through L0 regularization.
 
When two features frequently activate at the same time, we say they co-occur, meaning their activations are highly correlated.
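One way to make co-occurrence concrete is to count how often pairs of features fire on the same token. The sketch below is illustrative only: the function name `cooccurrence_stats`, the firing threshold, the Jaccard normalization, and the toy activation matrix are all assumptions, not something taken from a specific SAE library or paper.

```python
import numpy as np

def cooccurrence_stats(acts, threshold=0.0):
    """Pairwise co-occurrence of SAE features (hypothetical helper).

    acts: array of shape (n_tokens, n_features) holding latent activations.
    A feature counts as "active" on a token when its activation > threshold.
    """
    active = (acts > threshold).astype(np.int64)   # binary firing pattern
    counts = active.T @ active                     # counts[i, j] = tokens where i and j both fire
    freq = np.diag(counts).astype(float)           # per-feature firing counts
    union = freq[:, None] + freq[None, :] - counts # tokens where i or j fires
    # Jaccard similarity: shared firings / total firings, 0 where neither fires
    jaccard = np.divide(counts, union,
                        out=np.zeros_like(union, dtype=float),
                        where=union > 0)
    return counts, jaccard

# Toy activations (made up): 4 tokens, 3 features
acts = np.array([
    [1.0, 0.5, 0.0],
    [2.0, 0.3, 0.0],
    [0.0, 0.0, 1.0],
    [1.5, 0.0, 0.0],
])
counts, jaccard = cooccurrence_stats(acts)
print(counts[0, 1])    # features 0 and 1 fire together on 2 tokens
print(jaccard[0, 2])   # features 0 and 2 never fire together
```

Jaccard similarity is used here instead of raw counts so that very frequent features do not dominate; a Pearson correlation over the raw activations would be an equally reasonable choice.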

Topological data analysis

Graph modeling of SAE features shows that related features develop along the layers, and that later layers involve more complex features.
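As a rough illustration of treating features as a graph, the sketch below links features whose pairwise similarity exceeds a cutoff and reads off the connected components as clusters. This is a minimal sketch under assumptions: the helper name `feature_clusters`, the cutoff `min_sim`, and the toy similarity matrix are hypothetical, and it covers only a single layer rather than the cross-layer analysis described above.

```python
import numpy as np

def feature_clusters(similarity, min_sim=0.5):
    """Group features into connected components of a thresholded graph.

    similarity: symmetric (n_features, n_features) matrix, e.g. a
    co-occurrence or correlation score between features (assumed input).
    An edge joins i and j when similarity[i, j] >= min_sim.
    """
    n = similarity.shape[0]
    parent = list(range(n))            # union-find forest

    def find(i):
        # Follow parent pointers to the root, compressing the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i, j] >= min_sim:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri    # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy similarity matrix (made up): features 0-1 and 2-3 form two groups
sim = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.1, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])
clusters = feature_clusters(sim)
print(clusters)  # [[0, 1], [2, 3]]
```

Connected components are the simplest possible grouping; the choice of cutoff strongly affects the result, so it should be treated as a tunable assumption.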
