MFR (Mutual Feature Regularization)
Train two SAEs in parallel, then perform additional training to increase the similarity of their decoder weights. Similarity is measured with Mean Max Cosine Similarity (MMCS), and raising it reduces SAE dead features. An auxiliary penalty (MFR) is added to the loss function.
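A minimal sketch of how such a penalty could be computed, assuming each decoder is stored as an array with one feature direction per row. The names `mmcs`, `mfr_penalty`, and the weight `alpha` are hypothetical, and the `1 - MMCS` form is an assumption for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def mmcs(decoder_a, decoder_b, eps=1e-12):
    """Mean Max Cosine Similarity from features of A to features of B.

    decoder_a, decoder_b: arrays of shape (n_features, d_model),
    one decoder feature direction per row (assumed layout).
    """
    # Normalize rows to unit length; clamp norms to avoid division by zero.
    a = decoder_a / np.maximum(np.linalg.norm(decoder_a, axis=1, keepdims=True), eps)
    b = decoder_b / np.maximum(np.linalg.norm(decoder_b, axis=1, keepdims=True), eps)
    cos = a @ b.T  # pairwise cosine similarities, shape (n_a, n_b)
    # For each feature in A take its best match in B, then average.
    return float(cos.max(axis=1).mean())

def mfr_penalty(decoder_a, decoder_b, alpha=1.0):
    """Hypothetical auxiliary penalty: small when the decoders share features."""
    return alpha * (1.0 - mmcs(decoder_a, decoder_b))

def total_loss(recon_loss_a, recon_loss_b, decoder_a, decoder_b, alpha=1.0):
    """Sum of both SAEs' reconstruction losses plus the similarity penalty."""
    return recon_loss_a + recon_loss_b + mfr_penalty(decoder_a, decoder_b, alpha)
```

Because each feature is matched to its best counterpart, the penalty does not depend on the order in which the two SAEs happen to learn their features.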
https://arxiv.org/pdf/2411.01220v1

Seonglae Cho