A neural network component that projects its input into a sparse, higher-dimensional latent space and reconstructs the output of a target component (typically an MLP layer) rather than the input itself
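
The definition above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not a reference implementation: the class name `Transcoder`, the ReLU encoder, the L1 sparsity penalty, and the weight initialization are all assumptions made for the example (other sparsity mechanisms such as TopK are also used in practice). The key point it shows is that the training target is the output of the component being approximated, not the input.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


class Transcoder:
    """Hypothetical minimal transcoder: input -> sparse latents -> predicted output."""

    def __init__(self, d_in, d_latent, d_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small Gaussian init; an arbitrary choice for this sketch.
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.W_enc = init(d_latent, d_in)
        self.b_enc = [0.0] * d_latent
        self.W_dec = init(d_out, d_latent)
        self.b_dec = [0.0] * d_out

    def encode(self, x):
        # Project the input into the (wider) latent space; ReLU keeps latents non-negative.
        pre = [p + b for p, b in zip(matvec(self.W_enc, x), self.b_enc)]
        return relu(pre)

    def decode(self, z):
        # Map latents to a prediction of the target component's output.
        return [p + b for p, b in zip(matvec(self.W_dec, z), self.b_dec)]

    def forward(self, x):
        return self.decode(self.encode(x))


def transcoder_loss(pred, target, latents, l1_coeff=1e-3):
    """MSE against the component's output plus an L1 sparsity penalty (assumed)."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return mse + l1_coeff * sum(abs(z) for z in latents)
```

In use, `x` would be the activation entering the MLP and `target` the activation the MLP actually produces; a sparse autoencoder would instead pass `x` as both input and target.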
Per-layer transcoder
Transcoders
Transcoder
Transcoders Beat Sparse Autoencoders for Interpretability
- A narrower distribution of feature interpretations and stronger monosemanticity (each feature activates for a single meaning).
- Sparse probing performance similar to or slightly better than that of SAEs.
A skip transcoder can replace an SAE on the residual stream (when an identity skip connection is added).
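
One way to read the skip connection is sketched below, in plain Python. The function name `skip_transcoder_forward` and its parameters are hypothetical, and the ReLU encoder is an assumption. The output is the sparse path plus a linear skip term; when no skip matrix is given, the skip is the identity, so the sparse latents only have to model what is added on top of the input.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def skip_transcoder_forward(x, W_enc, b_enc, W_dec, b_dec, W_skip=None):
    """Return (output, latents) for a hypothetical skip transcoder.

    output = W_dec @ relu(W_enc @ x + b_enc) + b_dec + skip(x)
    where skip(x) is x itself (identity) when W_skip is None.
    """
    # Sparse path: encode with ReLU, then decode.
    latents = [max(0.0, p + b) for p, b in zip(matvec(W_enc, x), b_enc)]
    sparse_out = [p + b for p, b in zip(matvec(W_dec, latents), b_dec)]

    # Skip path: identity by default, otherwise a learned linear map.
    skip_out = list(x) if W_skip is None else matvec(W_skip, x)

    return [s + k for s, k in zip(sparse_out, skip_out)], latents
```

With an identity skip the input and output dimensions must match, which holds on the residual stream. If the decoder contributes nothing, the output is exactly the input.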

Seonglae Cho