Towards Scalable Parameter Decomposition
The most successful interpretability methods so far, like SAEs, have focused largely on the activations that flow through a model rather than on the weights that produce those activations in the first place. It's a bit like trying to understand a program by only looking at its runtime variables, never its source code. Parameter decomposition offers a way to break a model's parameters, its 'source code', into components that reveal not only what the network computes, but how it computes it. Today, we're releasing a paper on Stochastic Parameter Decomposition (SPD), which removes key barriers to the scalability of prior methods.
https://www.goodfire.ai/papers/stochastic-param-decomp
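To give a flavor of what "decomposing parameters into components" means, here is a toy sketch. It is not the SPD algorithm from the paper; it just illustrates the general idea of writing a weight matrix as a sum of rank-one components, using plain SVD as the simplest possible factorization:

```python
import numpy as np

# Toy illustration of parameter decomposition (NOT the SPD method itself):
# express one layer's weight matrix W as a sum of rank-one components,
# W = sum_c s_c * u_c v_c^T, here obtained via SVD.

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))  # stand-in for one layer's weights

U, s, Vt = np.linalg.svd(W)
components = [s[c] * np.outer(U[:, c], Vt[c]) for c in range(len(s))]

# Summing every component reconstructs W (up to floating-point error).
W_hat = sum(components)
print(np.allclose(W, W_hat))  # True
```

Methods like SPD replace this purely linear factorization with components optimized so that each one corresponds to a mechanistically meaningful piece of the computation, rather than merely a direction of maximal variance.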