Position-aware Edge Attribution Patching
This paper on automatic circuit discovery challenges a common assumption in existing methods, namely that circuits are invariant to token position. It argues that this assumption leads to significant limitations and proposes a position-aware pipeline to address them. Previous approaches approximated the indirect effects of edges, whereas this paper directly incorporates the fact that attention inherently creates cross-position connections.
Cross-position edges are defined at the (value, key, query) level, so edge importance can be computed separately for each position, even within the same head, and position-specific circuits can be constructed. The challenge is that real-world data varies in length and structure: the same token index t does not play the same role in every example, so scores cannot simply be aggregated at position t. To solve this, the paper introduces a schema (span labeling). In other words, it enables position-aware circuit discovery based on semantic span position instead of raw token position.
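To make the span idea concrete, here is a minimal sketch of aggregating per-token scores for a single edge by span label rather than by token index. The span labels, the scores, and the helper name `aggregate_by_span` are hypothetical, and summing within a span and then averaging across examples is an assumption made for illustration, not necessarily the aggregation the paper uses.

```python
from collections import defaultdict


def aggregate_by_span(examples):
    """Average per-span scores across examples of different lengths.

    Each example is a list of (span_label, score) pairs, one per token.
    Assumption: scores are summed within a span, then averaged over
    the examples that contain that span.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens in examples:
        per_span = defaultdict(float)
        for label, score in tokens:
            per_span[label] += score  # sum token scores inside one span
        for label, value in per_span.items():
            totals[label] += value
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Two examples of different lengths (3 and 5 tokens) sharing one schema.
examples = [
    [("subject", 0.5), ("verb", 0.25), ("object", 1.0)],
    [("subject", 0.25), ("subject", 0.25), ("verb", 0.75),
     ("object", 0.5), ("object", 0.5)],
]
span_scores = aggregate_by_span(examples)
```

Token index 2 is the object in the first example but the verb in the second, so averaging by index would mix unrelated roles. Grouping by span label keeps the scores comparable across examples.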

Seonglae Cho