Crosscoder

Creator
Seonglae Cho
Created
2024 Nov 2 0:14
Edited
2025 Feb 27 16:11
The architecture has separate encoder and decoder weights $W_{enc}$, $W_{dec}$ for each layer and shares only the latent dictionary across layers, for scalability.
https://transformer-circuits.pub/2024/crosscoders/index.html
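
To make the "separate weights per layer, one shared dictionary" idea concrete, here is a minimal NumPy sketch of a crosscoder forward pass. The sizes, the random initialization, the `l1_coeff` value, and the function names `encode`, `decode`, and `loss` are illustrative assumptions, not taken from the linked post's code. The sparsity term (latent activation weighted by the sum of per-layer decoder norms) follows the loss as I understand it from that post, so treat it as a sketch rather than a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not values from the source).
n_layers, d_model, n_latents = 3, 8, 16

# One encoder and one decoder matrix per layer; only the latent dimension is shared.
W_enc = rng.normal(scale=0.1, size=(n_layers, d_model, n_latents))
W_dec = rng.normal(scale=0.1, size=(n_layers, n_latents, d_model))
b_enc = np.zeros(n_latents)
b_dec = np.zeros((n_layers, d_model))


def encode(acts):
    """acts: (n_layers, d_model) activations at one token position."""
    # Per-layer encoder contributions are summed into a single latent vector.
    pre = np.einsum("ld,ldf->f", acts, W_enc) + b_enc
    return np.maximum(pre, 0.0)  # ReLU


def decode(latents):
    """Reconstruct every layer's activations from the same latent vector."""
    return np.einsum("f,lfd->ld", latents, W_dec) + b_dec


def loss(acts, l1_coeff=1e-3):
    latents = encode(acts)
    recon = decode(latents)
    mse = np.sum((acts - recon) ** 2)
    # Each latent is penalized in proportion to the sum of its per-layer decoder norms.
    dec_norms = np.linalg.norm(W_dec, axis=2).sum(axis=0)  # shape (n_latents,)
    return mse + l1_coeff * np.sum(latents * dec_norms)


acts = rng.normal(size=(n_layers, d_model))
latents = encode(acts)
recon = decode(latents)
```

Here `latents` has one entry per dictionary feature regardless of how many layers are read, while `recon` has one row per layer, which is the sense in which only the dictionary is shared.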

Acausal crosscoder

 
Crosscoders
 
 
Transcoders
 
 

Transfer Learning across layers

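Because adjacent layers share much of their representation, a Sparse AutoEncoder (SAE) for one layer can be initialized from an SAE already trained on a neighboring layer instead of being trained from scratch, which can significantly reduce training cost and time. Backward transfer (starting from a later layer's SAE) worked better than forward transfer (starting from an earlier layer's SAE), which can be understood as starting with prior knowledge of the computation's results.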
  • forward SAE
  • backward SAE
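
The two initialization directions can be sketched as follows. This is a minimal illustration under stated assumptions: SAE parameters are represented as a plain dict of NumPy arrays, the helper names `init_sae_random` and `init_sae_from` are hypothetical, the "trained" SAEs are random stand-ins, and "forward" is taken to mean copying from layer i-1 while "backward" means copying from layer i+1.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_latents = 8, 16  # illustrative sizes (assumptions)


def init_sae_random():
    """Hypothetical SAE parameters as a plain dict of arrays."""
    return {
        "W_enc": rng.normal(scale=0.1, size=(d_model, n_latents)),
        "b_enc": np.zeros(n_latents),
        "W_dec": rng.normal(scale=0.1, size=(n_latents, d_model)),
        "b_dec": np.zeros(d_model),
    }


def init_sae_from(source_params):
    """Start a new SAE from a copy of a neighboring layer's SAE parameters."""
    return {name: value.copy() for name, value in source_params.items()}


# Random stand-ins for SAEs already trained on layers 3 and 5.
trained = {3: init_sae_random(), 5: init_sae_random()}

target_layer = 4
# Forward (assumed): initialize from the earlier layer's SAE.
forward_init = init_sae_from(trained[target_layer - 1])
# Backward (assumed): initialize from the later layer's SAE.
backward_init = init_sae_from(trained[target_layer + 1])
# Either initialization would then be fine-tuned on layer-4 activations.
```

Copying the arrays rather than aliasing them keeps the source SAE unchanged while the new one is fine-tuned.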
 
 
