An SAE can be viewed as solving a bilevel optimization problem: the outer optimization minimizes reconstruction error plus a sparsity regularizer, while the inner optimization finds the encoder's optimal projection onto a given constraint set.
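The two levels can be made concrete with a small pure-Python sketch. It is an illustration under stated assumptions, not the formulation of any particular paper: the inner problem is taken to be a Euclidean projection of the encoder pre-activations onto a constraint set (the nonnegative orthant gives ReLU, and nonnegative vectors with at most k nonzeros give TopK), and the outer objective is assumed to be squared reconstruction error plus an L1 penalty. All function names, the weight layout, and the `lam` parameter are hypothetical.

```python
def project_nonneg(v):
    """Euclidean projection onto {p : p_i >= 0}; this is exactly ReLU."""
    return [max(x, 0.0) for x in v]


def project_topk_nonneg(v, k):
    """Euclidean projection onto {p : p_i >= 0, at most k nonzeros}.

    Keeps the k largest entries (ties broken by index) and clips them at zero.
    """
    order = sorted(range(len(v)), key=lambda i: v[i], reverse=True)
    keep = set(order[:k])
    return [max(v[i], 0.0) if i in keep else 0.0 for i in range(len(v))]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(x, W, b, project):
    """Inner problem: compute pre-activations W^T x + b, then project them.

    Assumed layout: W[j][i] maps input dimension j to latent i.
    """
    pre = [sum(W[j][i] * x[j] for j in range(len(x))) + b[i]
           for i in range(len(b))]
    return project(pre)


def outer_loss(x, W, b, D, project, lam):
    """Outer objective (assumed form): reconstruction error + lam * L1 norm.

    Assumed layout: D[i][j] maps latent i back to input dimension j.
    """
    z = encode(x, W, b, project)
    x_hat = [sum(D[i][j] * z[i] for i in range(len(z)))
             for j in range(len(x))]
    return sq_dist(x, x_hat) + lam * sum(abs(c) for c in z)
```

Swapping `project` changes the constraint set of the inner problem while the outer objective stays the same, which is one way to read the claim that the choice of encoder nonlinearity carries its own assumptions. For TopK, pass something like `lambda v: project_topk_nonneg(v, 2)` as `project`.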
Duality
There is a fundamental duality between how concepts are organized in model representations and how an SAE encoder's receptive fields should be structured to optimally identify those concepts. Crucially, this implies that any SAE is implicitly biased towards identifying concepts that are organized in a specific manner.
The Rate Distortion Dance between reconstruction (Compressed sensing) and sparsity (Interpretable Sparse Coding)
The Rate Distortion Dance of Sparse Autoencoders | Tilde
Overview: in this blog post, we are going to be setting some of the theoretical foundations and intuition for the problems we think about. Over the coming week, we will release different blog posts focused on specific experiments and empirical questions. As such, this post aims to lay the groundwork for what's to come. We're excited to share the tip of the iceberg!
https://www.tilderesearch.com/blog/rate-distortion-saes


Seonglae Cho