The architecture has a separate $W_{\text{enc}}^{\ell}$ and $W_{\text{dec}}^{\ell'}$ for each layer while sharing only the latent dictionary for scaling, where $W_{\text{enc}}^{\ell}$ is the source layer-specific encoder, $W_{\text{dec}}^{\ell'}$ is the target layer-specific decoder, and $\hat{x}^{\ell'} = W_{\text{dec}}^{\ell'} f(x^{\ell})$ is the reconstructed layer activation from the source latent $f(x^{\ell})$.
Each input–output layer pair has its own encoder–decoder weight pair. Unlike cross-layer transcoders (Circuit Tracing), the encoders are not shared; instead, the layers share the same latent space, which is enforced only approximately through the training loss.
This is achieved by co-training with an alignment loss that forces the per-layer latents to match, so that they converge to a shared latent space.
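A minimal PyTorch sketch of this layout: per-layer encoder–decoder pairs that share only the latent dimensionality, with an alignment term that co-trains the per-layer latents toward a common latent space. The class name, the plain L1 sparsity term, and the coefficients are illustrative assumptions, not taken from the source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PerLayerSharedLatentCoder(nn.Module):
    """Sketch: one encoder and one decoder per layer; only d_latent is shared."""

    def __init__(self, n_layers: int, d_model: int, d_latent: int):
        super().__init__()
        self.encoders = nn.ModuleList([nn.Linear(d_model, d_latent) for _ in range(n_layers)])
        self.decoders = nn.ModuleList([nn.Linear(d_latent, d_model) for _ in range(n_layers)])

    def forward(self, acts):
        # acts[l]: activations of layer l, shape (batch, d_model)
        latents = [F.relu(enc(x)) for enc, x in zip(self.encoders, acts)]
        recons = [dec(z) for dec, z in zip(self.decoders, latents)]
        return latents, recons


def training_loss(latents, recons, acts, l1_coef=1e-3, align_coef=1.0):
    # Reconstruction: each layer's decoder rebuilds that layer's activation.
    recon = sum(F.mse_loss(r, x) for r, x in zip(recons, acts))
    # Sparsity on the latents (plain L1 here for brevity).
    sparsity = sum(z.abs().mean() for z in latents)
    # Alignment: pull each layer's latent toward the mean latent, so the
    # separately-encoded latents converge to one shared latent space.
    mean_z = torch.stack(latents).mean(dim=0)
    align = sum(F.mse_loss(z, mean_z) for z in latents)
    return recon + l1_coef * sparsity + align_coef * align


# Usage sketch
model = PerLayerSharedLatentCoder(n_layers=3, d_model=512, d_latent=4096)
acts = [torch.randn(8, 512) for _ in range(3)]
latents, recons = model(acts)
loss = training_loss(latents, recons, acts)
loss.backward()
```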

Decoder sparsity loss
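For reference, a hedged sketch of the decoder sparsity penalty commonly used for crosscoders, where each latent's activation is weighted by the summed norms of its per-layer decoder vectors (notation assumed, not quoted from the linked note):

$$\mathcal{L}_{\text{sparsity}} = \lambda \sum_{j} f_j(x) \sum_{\ell} \big\lVert W_{\text{dec},\,j}^{\ell} \big\rVert_2$$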
Crosscoders
CrossCoder (2024)
Fine-tuned model diffing and cross-scale transferability, by diffing models within the same architecture
BatchTopK crosscoder to prevent Complete Shrinkage and Latent Decoupling when diffing chat models (see the sketch below)
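A hedged sketch of the BatchTopK activation rule itself, where sparsity is enforced over the whole batch rather than per example; the surrounding crosscoder and chat-model diffing details are omitted, and the function name is illustrative.

```python
import torch


def batch_topk(latents: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k * batch_size largest activations across the whole batch and
    zero the rest, so the average number of active latents per example is k."""
    batch_size = latents.shape[0]
    flat = latents.flatten()
    n_keep = min(k * batch_size, flat.numel())
    # Indices of the globally largest activations in this batch.
    _, topk_idx = flat.topk(n_keep)
    mask = torch.zeros_like(flat)
    mask[topk_idx] = 1.0
    return (flat * mask).view_as(latents)
```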

Seonglae Cho