Crosscoders
- The architecture has a separate encoder and decoder for each layer while sharing only the latent dictionary, which is what lets a single latent space scale across many layers (see the sketch below).
- Acausal crosscoder: the shared latents read from and reconstruct the activations of all layers simultaneously, rather than respecting the causal layer order.
- Contrast with transcoders, which map a component's input to its output (e.g., a sparse approximation of an MLP) instead of reconstructing the same activations.
- CrossCoder (2024): also used for model diffing, training a shared dictionary over a base model and its fine-tuned variant within the same architecture to study which features transfer and which are model-specific.
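A minimal sketch of the shared-dictionary design, assuming activations are collected per layer; names such as `Crosscoder`, `d_latent`, and `l1_coeff` are illustrative choices, not from the source:

```python
import torch
import torch.nn as nn


class Crosscoder(nn.Module):
    """One shared latent dictionary; separate encoder/decoder per layer."""

    def __init__(self, n_layers: int, d_model: int, d_latent: int):
        super().__init__()
        # Per-layer encoder and decoder weights over a single latent space.
        self.W_enc = nn.Parameter(torch.randn(n_layers, d_model, d_latent) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(n_layers, d_latent, d_model) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_latent))           # shared latent bias
        self.b_dec = nn.Parameter(torch.zeros(n_layers, d_model))  # per-layer output bias

    def forward(self, acts: torch.Tensor):
        # acts: (batch, n_layers, d_model), activations from every layer.
        # Encode: per-layer contributions are summed into ONE shared latent vector.
        f = torch.relu(torch.einsum("bld,ldk->bk", acts, self.W_enc) + self.b_enc)
        # Decode: each layer gets its own reconstruction from the shared latents.
        recon = torch.einsum("bk,lkd->bld", f, self.W_dec) + self.b_dec
        return recon, f


def crosscoder_loss(acts, recon, f, W_dec, l1_coeff=1e-3):
    # Reconstruction error summed over layers, plus a sparsity penalty that
    # weights each latent activation by the total norm of its decoder vectors.
    recon_loss = (acts - recon).pow(2).sum(dim=(1, 2)).mean()
    dec_norms = W_dec.norm(dim=2).sum(dim=0)  # (d_latent,)
    sparsity = (f * dec_norms).sum(dim=1).mean()
    return recon_loss + l1_coeff * sparsity
```

For model diffing, the layer axis can instead index the same layer in two models; comparing each latent's per-model decoder norms then separates shared features from model-specific ones.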
Transfer Learning across layers
By leveraging the shared representations of adjacent layers, training cost and time can be reduced significantly by initializing each sparse autoencoder (SAE) from a trained neighbor instead of training from scratch. Backward transfer worked better than forward transfer, which can be understood as starting with prior knowledge of the computation's results; see the sketch after this list.
- forward SAE: initialize the SAE for layer l+1 from the trained SAE of layer l (transfer in the direction of computation)
- backward SAE: initialize the SAE for layer l from the trained SAE of layer l+1 (transfer against the direction of computation)
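A hedged sketch of the two sweep orders, assuming a `train_sae(sae, layer)` routine that fits an SAE to a given layer's activations; all names here are illustrative, not from the source:

```python
import copy


def train_saes_with_transfer(layer_ids, new_sae, train_sae, backward=True):
    """Train one SAE per layer, warm-starting each from its trained neighbor.

    layer_ids: layer indices in forward order, e.g. range(n_layers)
    new_sae:   zero-arg factory returning a freshly initialized SAE
    train_sae: callable (sae, layer) -> SAE trained on that layer's activations
    backward:  if True, sweep from the last layer to the first, so each
               earlier-layer SAE starts from the SAE of the layer after it
    """
    order = list(layer_ids)
    if backward:
        order = order[::-1]
    saes, prev = {}, None
    for layer in order:
        # Warm start from the neighbor's trained SAE when available;
        # only the first SAE in the sweep is trained from scratch.
        sae = copy.deepcopy(prev) if prev is not None else new_sae()
        saes[layer] = train_sae(sae, layer)
        prev = saes[layer]
    return saes
```

Only the sweep direction differs between the two settings, which makes them easy to compare under a fixed training budget.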