Additive Mechanism with Reversing Transformer
- Subject heads - extract subject information by attending to the subject tokens
- Relation heads - extract the attributes associated with the relation
- MLP - reinforces the relation property
- Token concatenation - gathers token information at the last token
- Fact lookup - MLPs map the final-token representation to a linear representation of the attribute
- Attribute extraction - maps the attribute to the output
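As a toy sketch of the additive idea in the list above: each component writes its own logit contribution, and the final prediction comes from their sum. The prompt, the candidate tokens, and every number below are invented for illustration; nothing here is measured from a real model.

```python
# Toy illustration of an additive mechanism: independent components each
# add a contribution to the output logits, and the answer emerges from the sum.
# Hypothetical prompt: "The Eiffel Tower is located in the city of"
CANDIDATES = ["Paris", "Rome", "France", "iron"]

# Made-up per-component logit contributions (assumptions, not real data).
contributions = {
    # Subject heads boost anything associated with the subject.
    "subject_heads": {"Paris": 2.0, "Rome": 0.0, "France": 2.2, "iron": 1.5},
    # Relation heads boost anything that fits the relation (a city).
    "relation_heads": {"Paris": 1.5, "Rome": 1.6, "France": 0.0, "iron": 0.0},
    # The MLP reinforces relation-consistent candidates.
    "mlp": {"Paris": 0.5, "Rome": 0.5, "France": 0.0, "iron": 0.0},
}


def sum_logits(contributions, candidates):
    """Add up every component's contribution for each candidate token."""
    return {
        c: sum(component.get(c, 0.0) for component in contributions.values())
        for c in candidates
    }


def top_candidate(logits):
    """Return the candidate with the highest logit."""
    return max(logits, key=logits.get)


logits = sum_logits(contributions, CANDIDATES)
prediction = top_candidate(logits)

print("subject heads alone :", top_candidate(contributions["subject_heads"]))
print("relation heads alone:", top_candidate(contributions["relation_heads"]))
print("summed prediction   :", prediction)
```

In this toy setup no single component picks the right answer on its own; only the candidate that is both subject-related and relation-consistent wins once the contributions are added.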
Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1) — AI Alignment Forum
https://www.alignmentforum.org/posts/iGuwZTHWb6DFY3sKB/fact-finding-attempting-to-reverse-engineer-factual-recall
New Study Finds LLMs Rely More on Recall Than Logic

Seonglae Cho