Exchanging latent features across SAEs of different sizes
Reconstruction latent (SAE Feature Splitting, SAE Feature Absorption)
If reconstruction performance degrades or remains unchanged after a latent from the larger SAE is added to the smaller SAE, that latent is judged to be a finer-grained representation of latents already present in the smaller SAE. We call it a reconstruction latent.
Novel latent
A novel latent is identified when adding an individual latent from the larger SAE to the smaller SAE improves reconstruction performance (e.g., lowers MSE), indicating that the latent carries new information not present in the smaller SAE.
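The test described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the procedure from any particular codebase: the SAE is assumed to be a plain ReLU autoencoder (encoder matrix, encoder bias, decoder matrix, decoder bias), MSE is used as the metric, and the names `sae_mse`, `classify_latent`, and the tolerance `tol` are hypothetical.

```python
import numpy as np

def sae_mse(x, w_enc, b_enc, w_dec, b_dec):
    """Reconstruction MSE of an assumed plain ReLU SAE on activations x."""
    acts = np.maximum(x @ w_enc + b_enc, 0.0)   # (batch, n_latents)
    x_hat = acts @ w_dec + b_dec                # (batch, d_model)
    return float(np.mean((x - x_hat) ** 2))

def classify_latent(x, small, enc_col, enc_bias, dec_row, tol=0.0):
    """Add one latent from a larger SAE to the smaller SAE and compare MSE.

    small   : tuple (w_enc, b_enc, w_dec, b_dec) of the smaller SAE
    enc_col : encoder column of the candidate latent, shape (d_model,)
    dec_row : decoder row of the candidate latent, shape (d_model,)
    tol     : hypothetical margin by which MSE must drop to count as novel
    """
    w_enc, b_enc, w_dec, b_dec = small
    base = sae_mse(x, w_enc, b_enc, w_dec, b_dec)
    # Append the candidate latent's encoder column, bias, and decoder row.
    w_enc2 = np.concatenate([w_enc, enc_col[:, None]], axis=1)
    b_enc2 = np.append(b_enc, enc_bias)
    w_dec2 = np.concatenate([w_dec, dec_row[None, :]], axis=0)
    stitched = sae_mse(x, w_enc2, b_enc2, w_dec2, b_dec)
    label = "novel" if stitched < base - tol else "reconstruction"
    return label, base, stitched

# Toy data: the small SAE only represents the first input dimension.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
small = (np.array([[1.0], [0.0]]), np.array([0.0]),
         np.array([[1.0, 0.0]]), np.array([0.0, 0.0]))

# A latent covering the second dimension adds new information.
novel = classify_latent(x, small, np.array([0.0, 1.0]), 0.0, np.array([0.0, 1.0]))
# A latent duplicating the first dimension double-counts it and hurts MSE.
dup = classify_latent(x, small, np.array([1.0, 0.0]), 0.0, np.array([1.0, 0.0]))
```

In this toy setup the latent covering the missing dimension is labeled novel, while the duplicate is labeled a reconstruction latent because adding it makes the reconstruction worse. A real evaluation would use held-out model activations and may need a nonzero `tol` to separate a genuine improvement from noise.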