Computational subgraph of a neural network.
At a high level, circuits are collections of features connected by weights that implement algorithms (Chris Olah).
A limitation of circuit analysis is that it tends to focus on a single circuit or individual mechanism at a time. In the case of attention, each head operates independently and writes its output additively, so interactions between heads alone are not enough to explain every complex mechanism. As a result, the trend is moving away from circuits and toward the independent functions of attention heads or toward SAEs (sparse autoencoders) themselves.
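The additive, per-head behavior of attention mentioned above can be checked with a small sketch. It assumes the standard multi-head formulation, in which head outputs are concatenated and multiplied by an output projection. All names and sizes here (`head_outputs`, `w_o`, `n_heads`, and so on) are hypothetical and chosen only for illustration, and the per-head outputs are random placeholders rather than real attention results.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_tokens, n_heads, d_head, d_model = 3, 2, 4, 8  # illustrative sizes (assumed)

# Placeholder per-head outputs, standing in for "attention pattern applied to values".
head_outputs = [rand_matrix(n_tokens, d_head, rng) for _ in range(n_heads)]
# Output projection with shape (n_heads * d_head, d_model).
w_o = rand_matrix(n_heads * d_head, d_model, rng)

# View 1: concatenate the heads along the feature dimension, then project.
concat = [sum((h[t] for h in head_outputs), []) for t in range(n_tokens)]
combined = matmul(concat, w_o)

# View 2: each head writes through its own row slice of w_o; the results are summed.
per_head = [
    matmul(head_outputs[h], w_o[h * d_head:(h + 1) * d_head])
    for h in range(n_heads)
]
summed = per_head[0]
for contribution in per_head[1:]:
    summed = add(summed, contribution)

# The two views agree up to floating-point error.
max_diff = max(
    abs(x - y) for ra, rb in zip(combined, summed) for x, y in zip(ra, rb)
)
print(max_diff < 1e-9)
```

Because the combined output is a plain sum of per-head terms, each head's contribution can be studied in isolation, which is one reason analysis can focus on individual heads rather than only on interactions between them.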
AI Circuit Notion
On the Biology of a Large Language Model
We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology.
https://transformer-circuits.pub/2025/attribution-graphs/biology.html

Isolating circuit paths
Circuits Updates - April 2024
We report a number of developing ideas on the Anthropic interpretability team, which might be of interest to researchers working actively in this space. Some of these are emerging strands of research where we expect to publish more on in the coming months. Others are minor points we wish to share, since we're unlikely to ever write a paper about them.
https://transformer-circuits.pub/2024/april-update/index.html#circuit-path-lengths
Society of thought
For reasoning models, the view that the model simulates an internal structure in which multiple viewpoints interact has more explanatory power than the conventional view that attributes improved performance to a "longer chain-of-thought." Experimental results show that even when only correct answers are rewarded, models spontaneously develop conversational behaviors (such as questioning and perspective shifts), and that using conversational scaffolding in fine-tuning leads to faster improvements in reasoning performance.

Seonglae Cho