SAE Feature Stitching

Creator
Seonglae Cho
Created
2025 Mar 7 17:15
Editor
Edited
2025 Mar 8 12:24
Refs

Exchanging latent features between SAEs of different sizes

Reconstruction latent (SAE Feature Splitting, SAE Feature Absorption)

If adding a latent from the larger SAE to the smaller SAE degrades reconstruction performance or leaves it unchanged, that latent is judged to be a more detailed representation of latents already present in the smaller SAE. We call it a reconstruction latent.

Novel latent

A novel latent is identified when adding an individual latent from the larger SAE to the smaller SAE improves reconstruction performance (e.g., lowers MSE), indicating that the latent carries new information not present in the smaller SAE.
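The two definitions above can be sketched as a single test: append one latent from the larger SAE to the smaller one and compare MSE before and after. This is a minimal illustration, not the reference procedure. It assumes plain ReLU SAEs stored as dictionaries of NumPy arrays, and the names `reconstruct`, `stitch_latent`, `classify_latent`, and the `tol` threshold are hypothetical helpers introduced here. The toy weights and data are made up for the example.

```python
import numpy as np


def reconstruct(x, w_enc, b_enc, w_dec, b_dec):
    """Reconstruct x with a ReLU SAE: relu(x @ w_enc + b_enc) @ w_dec + b_dec."""
    acts = np.maximum(x @ w_enc + b_enc, 0.0)
    return acts @ w_dec + b_dec


def mse(x, x_hat):
    """Mean squared error over all samples and dimensions."""
    return float(np.mean((x - x_hat) ** 2))


def stitch_latent(small, large, idx):
    """Return a copy of `small` with latent `idx` of `large` appended."""
    return {
        "w_enc": np.concatenate([small["w_enc"], large["w_enc"][:, idx:idx + 1]], axis=1),
        "b_enc": np.concatenate([small["b_enc"], large["b_enc"][idx:idx + 1]]),
        "w_dec": np.concatenate([small["w_dec"], large["w_dec"][idx:idx + 1, :]], axis=0),
        "b_dec": small["b_dec"],  # assumption: keep the smaller SAE's decoder bias
    }


def classify_latent(x, small, large, idx, tol=0.0):
    """Label a latent 'novel' if stitching it in lowers MSE, else 'reconstruction'."""
    base = mse(x, reconstruct(x, **small))
    stitched = mse(x, reconstruct(x, **stitch_latent(small, large, idx)))
    return "novel" if stitched < base - tol else "reconstruction"


# Toy data in 2D (hypothetical): the small SAE only covers the first axis.
X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 2.0]])
small = {
    "w_enc": np.array([[1.0], [0.0]]),
    "b_enc": np.array([0.0]),
    "w_dec": np.array([[1.0, 0.0]]),
    "b_dec": np.zeros(2),
}
large = {
    # latent 0: narrower version of the small SAE's latent (fires only for large x[0])
    # latent 1: covers the second axis, which the small SAE ignores
    "w_enc": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "b_enc": np.array([-1.5, 0.0]),
    "w_dec": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "b_dec": np.zeros(2),
}

for i in range(2):
    print(i, classify_latent(X, small, large, i))
```

In this toy setup, latent 0 duplicates a direction the smaller SAE already reconstructs, so stitching it in raises MSE and it is labeled a reconstruction latent. Latent 1 covers a direction the smaller SAE misses, so MSE drops and it is labeled novel. With real SAEs the comparison would be run on held-out activations, and a small positive `tol` may be preferable to treat negligible MSE changes as "unchanged".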
