Attention-Causal Communication

Creator
Seonglae Cho
Created
2025 Dec 22 23:44
Edited
2025 Dec 22 23:49
Refs

ACC

Relative Attention + Sparse Attention Decomposition
Proposes a method to precisely isolate and track the causal features that make attention select specific token pairs, using low-dimensional signals in the QK matrix to find causal circuits with a single forward pass. Instead of directly analyzing the nonlinear Softmax, it defines relative attention, a linear substitute used for causal analysis. It decomposes the QK matrix via SVD and selects the smallest set of terms from the relative attention expansion (sparse term selection) to identify the signal subspace. With these signals, it traces circuits per input in a single run, without activation patching. Signals are divided into data signals and control signals, and control signals are shown to cause attention sinks (concentration on start tokens, etc.).
  1. Define "relative attention" as a linear analysis quantity (it does not replace Softmax in the actual computation)
  2. Decompose the QK matrix via SVD into a few axes that carry the attention scores
  3. Select the smallest set of terms ("smallest k") as signals (sparse term selection from the relative attention expansion)
  4. Decompose the residual stream exactly as the sum of upstream outputs
  5. Define a linear contribution function for relative attention
  6. Build the circuit/communication graph from a single forward pass (see the sketch below)
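The following NumPy sketch shows one way these pieces could fit together. The function names (qk_svd, relative_attention_terms, select_sparse_terms, upstream_contributions), the reference-key definition of relative attention, and the greedy coverage criterion are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

# Hypothetical shapes/names for one attention head:
#   W_Q, W_K : (d_model, d_head) query/key weights
#   resid    : (n_tokens, d_model) residual stream at the head's input
#   parts    : {component_name: (n_tokens, d_model)} upstream outputs whose
#              sum reconstructs `resid` exactly (model-specific split)

def qk_svd(W_Q, W_K):
    """SVD of the low-rank QK matrix W_Q W_K^T into rank-1 score axes."""
    d_head = W_Q.shape[1]
    W_QK = W_Q @ W_K.T / np.sqrt(d_head)               # (d_model, d_model), rank <= d_head
    U, S, Vt = np.linalg.svd(W_QK, full_matrices=False)
    return U[:, :d_head], S[:d_head], Vt[:d_head]      # keep the non-trivial axes

def relative_attention_terms(resid, U, S, Vt, q, k, k_ref):
    """Per-axis contributions to the relative score s(q, k) - s(q, k_ref),
    a linear quantity analysed instead of the Softmax itself (assumed definition)."""
    proj_q  = resid[q] @ U                             # (rank,)
    proj_dk = (resid[k] - resid[k_ref]) @ Vt.T         # (rank,)
    return S * proj_q * proj_dk                        # one term per SVD axis

def select_sparse_terms(terms, coverage=0.9):
    """Greedy pick of the smallest axis set covering most of the relative score."""
    order, total = np.argsort(-np.abs(terms)), terms.sum()
    picked, acc = [], 0.0
    for r in order:
        picked.append(r)
        acc += terms[r]
        if abs(acc) >= coverage * abs(total):
            break
    return picked

def upstream_contributions(parts, resid, U, S, Vt, q, k, k_ref, axes):
    """Query-side linear contribution of each upstream component along the
    selected axes; these are the edges of a per-input circuit graph and
    require only the activations from a single forward pass."""
    proj_dk = (resid[k] - resid[k_ref]) @ Vt[axes].T   # key side held fixed
    edges = {}
    for name, out in parts.items():
        proj_q = out[q] @ U[:, axes]
        edges[name] = float(np.sum(S[axes] * proj_q * proj_dk))
    return edges                                       # sums to the sparse relative score
```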

Data signal

A signal that depends on the input content. Ablating it directly degrades task performance (IOI/GT/GP, etc.)
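A minimal sketch of such an ablation check, assuming the signal axes U[:, axes] from the sketch above and a hypothetical evaluation harness run_task_metric:

```python
import numpy as np

# Hypothetical ablation check for a data signal: project the residual stream
# off the selected signal axes U[:, axes] (from the sketch above) and compare
# task performance before and after. `run_task_metric` stands in for whatever
# evaluation harness (IOI logit difference, etc.) is actually used.

def ablate_signal(resid, U, axes):
    """Remove the span of the selected signal axes from every token's residual."""
    basis = U[:, axes]                      # orthonormal columns from the SVD
    return resid - (resid @ basis) @ basis.T

# clean_score   = run_task_metric(model, resid)
# ablated_score = run_task_metric(model, ablate_signal(resid, U, axes))
# For a data signal, ablated_score should drop well below clean_score.
```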

Control signal

A signal nearly independent of the input content. It appears in many or all tokens and is commonly reused across multiple heads. When a head does not need to transfer information, this signal sends attention to start tokens, punctuation, etc., to fill the Softmax denominator. In other words, it is a control signal meaning "this head does not need to transfer information right now," i.e., the causal origin of the Attention Sink.
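A toy numeric illustration (not from the paper) of the sink mechanism: an input-independent score on the start token absorbs the Softmax mass whenever the data-dependent scores are weak.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Weak data-dependent scores for four content tokens, plus a constant,
# input-independent control-signal score on the start token (index 0).
data_scores = np.array([0.1, -0.2, 0.05, 0.0])
sink_score = 4.0

weights = softmax(np.concatenate(([sink_score], data_scores)))
print(weights.round(2))   # -> [0.93 0.02 0.01 0.02 0.02]
# The start token fills the Softmax denominator, so the head transfers
# almost no information from the content tokens: an attention sink.
```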
