ACC
Relative Attention + Sparse Attention Decomposition
Proposes a method to precisely isolate and track the causal features that make attention select specific token pairs, using low-dimensional signals in the QK matrix to find causal circuits in a single forward pass. Instead of directly analyzing the nonlinear Softmax, it defines relative attention, a linear stand-in used only for causal analysis. It decomposes the QK matrix via SVD and selects the smallest set of terms from the relative-attention expansion (sparse term selection) to identify the signal subspace. These signals enable per-input circuit tracing in a single run, without activation patching. Signals divide into data signals and control signals, and the control signals turn out to cause attention sinks (concentration on start tokens, etc.).
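To make the definitions concrete, here is a minimal formalization in standard transformer notation; the symbols ($s_{ij}$, $o_c$, $\sigma_k$, $d_k$) are shorthand assumed here, not taken from the paper. Since Softmax over keys is shift-invariant for a fixed query, only score differences matter, so relative attention is linear in the contributions:

$$s_{ij} = \frac{x_i^\top W_Q W_K^\top x_j}{\sqrt{d_k}}, \qquad \Delta s_i(j, j') = s_{ij} - s_{ij'}$$

With the SVD $W_Q W_K^\top = \sum_k \sigma_k u_k v_k^\top$ and the exact residual decomposition $x = \sum_c o_c$ (sum of upstream component outputs), each score expands into scalar terms:

$$s_{ij} = \frac{1}{\sqrt{d_k}} \sum_{k}\sum_{c,\,c'} \sigma_k \,\big(o_c^{(i)\top} u_k\big)\big(v_k^\top o_{c'}^{(j)}\big)$$

Sparse term selection then keeps the smallest subset of these terms that still accounts for $\Delta s_i(j, j')$.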
- Defines "relative attention" as a linear quantity for analysis (it does not replace Softmax in the actual computation)
- Decomposes the QK matrix via SVD, so attention scores live along "a few axes"
- Selects "the smallest k set" of terms as the signals (sparse term selection from the relative-attention expansion)
- The residual stream decomposes exactly as the sum of upstream component outputs
- Defines a "linear contribution function" for relative attention
- Builds a circuit/communication graph in a single forward pass (see the sketch after this list)
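A minimal NumPy sketch of this pipeline on random stand-in data. All names (`o_q`, `W_Q`, the 5% stopping rule) are illustrative assumptions, and sorting by magnitude is a simple greedy stand-in for whatever selection rule the paper actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_comp = 64, 4   # residual dim; number of upstream component outputs

# Residual streams decompose exactly as sums of upstream component outputs
o_q   = rng.normal(size=(n_comp, d))   # contributions to the query token
o_k   = rng.normal(size=(n_comp, d))   # contributions to the attended key token
o_ref = rng.normal(size=(n_comp, d))   # contributions to a reference key token
x_q, x_k, x_ref = o_q.sum(0), o_k.sum(0), o_ref.sum(0)

W_Q = rng.normal(size=(d, d)) / np.sqrt(d)
W_K = rng.normal(size=(d, d)) / np.sqrt(d)
U, S, Vt = np.linalg.svd(W_Q @ W_K.T)  # QK matrix along its singular directions

def score(x_i, x_j):
    """Pre-Softmax attention score."""
    return x_i @ W_Q @ W_K.T @ x_j / np.sqrt(d)

# Relative attention: a score *difference*, linear in the contributions
rel = score(x_q, x_k) - score(x_q, x_ref)

# Expand relative attention into scalar terms, one per
# (singular direction r, query component a, key component b)
terms = {
    (r, a, b): S[r] * (o_q[a] @ U[:, r]) * (Vt[r] @ (o_k[b] - o_ref[b])) / np.sqrt(d)
    for r in range(d) for a in range(n_comp) for b in range(n_comp)
}
assert np.isclose(sum(terms.values()), rel)  # the expansion is exact

# Sparse term selection: smallest set of terms that recovers relative attention
acc, signal = 0.0, []
for key, val in sorted(terms.items(), key=lambda kv: -abs(kv[1])):
    signal.append(key)
    acc += val
    if abs(acc - rel) < 0.05 * abs(rel):   # within 5% of the full value
        break
print(f"{len(signal)}/{len(terms)} terms recover rel. attention {rel:.3f} (got {acc:.3f})")
```

Because the expansion is exact, everything above is read off from activations of a single forward pass, with no interventions.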
Data signal
A signal that depends on input content. Ablating it directly degrades task performance (IOI/GT/GP, etc.); see the ablation sketch below.
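A hedged sketch of what checking a data signal via ablation could look like, continuing the toy setup above; the projected-out directions are an arbitrary choice for illustration, not the paper's selection:

```python
# Ablate a (hypothetical) data signal by projecting its key-side singular
# directions out of the key token's residual stream, then re-scoring.
def project_out(x, dirs):
    """Remove the span of `dirs` (orthonormal rows, e.g. rows of Vt) from x."""
    for v in dirs:
        x = x - (x @ v) * v
    return x

signal_dirs = Vt[:2]   # assumed signal subspace: top-2 key directions
x_k_ablated = project_out(x_k, signal_dirs)
print(score(x_q, x_k), score(x_q, x_k_ablated))
# On a real model, a performance drop on IOI/GT/GP prompts under this
# ablation is what marks the signal as a *data* signal.
```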
Control signal
A signal that is nearly independent of input content. It appears in many or all tokens and is commonly reused across multiple heads. When a head has no information to transfer, this signal sends attention to start tokens, punctuation, etc., filling the Softmax denominator. → A control signal meaning "this head doesn't need to transfer information right now", i.e., the causal origin of the Attention Sink (see the demo below).
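A self-contained toy demo of the control-signal story; the `ctrl`/`content` directions and all scales are invented for illustration, not extracted from a model:

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 32, 6

ctrl = np.zeros(d); ctrl[0] = 1.0        # content-independent "control" direction
content = np.zeros(d); content[1] = 1.0  # a genuine data direction

keys = 0.1 * rng.normal(size=(T, d))     # weak content in every key
keys[0] += 4.0 * ctrl                    # BOS key carries the control component
keys[3] += 4.0 * content                 # token 3 carries real content

def attn(q):
    s = keys @ q / np.sqrt(d)
    e = np.exp(s - s.max())
    return e / e.sum()

# Nothing to transfer: the query is (mostly) the control signal, so the BOS
# key fills the Softmax denominator and soaks up the attention mass — a sink.
print(np.round(attn(5.0 * ctrl), 3))
# Something to transfer: query aligns with content at position 3; sink vanishes.
print(np.round(attn(5.0 * content), 3))
```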

Seonglae Cho