To learn multi-dimensional features that span several axes, we partitioned the SAE latent space into groups and applied L1 regularization across groups. Activations within the same group were penalized less heavily, so that each group was encouraged to learn a multi-dimensional subspace. Jaccard similarity between features within a group was high, which indicates that grouped features were semantically similar. On real data, however, there was "insufficient meaningful progress" because of redundancy, fragmentation, and grouping failures.
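As a rough illustration of the idea, the sketch below uses a group-lasso style penalty: an L2 norm inside each group, summed (L1) across groups. This is an assumption, since the exact loss is not stated here; the names `group_sparsity_penalty` and `jaccard`, the fixed group size, and the toy activations are all hypothetical. The `jaccard` helper shows one plausible way to compare two latents, by the sets of inputs on which each one fires.

```python
import math


def group_sparsity_penalty(latents, group_size):
    """Sum over groups of the within-group L2 norm (assumed group-lasso form)."""
    if len(latents) % group_size != 0:
        raise ValueError("latent count must be divisible by group_size")
    penalty = 0.0
    for start in range(0, len(latents), group_size):
        group = latents[start:start + group_size]
        # L2 inside the group: co-activation within a group is cheap.
        penalty += math.sqrt(sum(a * a for a in group))
    # Summing the norms acts as L1 across groups: using many groups is costly.
    return penalty


def jaccard(active_a, active_b):
    """Jaccard similarity between the input sets on which two latents fire."""
    union = active_a | active_b
    if not union:
        return 0.0
    return len(active_a & active_b) / len(union)


# Toy activations with 4 latents in 2 groups of size 2 (hypothetical values).
concentrated = [3.0, 4.0, 0.0, 0.0]  # both active latents share one group
spread = [3.0, 0.0, 4.0, 0.0]        # active latents sit in different groups

print(group_sparsity_penalty(concentrated, 2))  # 5.0
print(group_sparsity_penalty(spread, 2))        # 7.0
print(jaccard({1, 2, 3}, {2, 3, 4}))            # 0.5
```

A plain L1 penalty would score both toy vectors as 7.0. Under the grouped penalty the concentrated vector costs less, which is the sense in which within-group activations are penalized more lightly.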
GroupSAE
Created: 2025 May 11 17:59
Edited: 2025 Jun 17 11:7