AI Arithmetic Reasoning

Creator: Seonglae Cho
Created: 2024 Sep 29 23:29
Edited: 2025 Nov 1 16:55

AI Counting (linebreaking)

In fixed-width text, the model must predict \n by determining whether the next word would exceed the line boundary. The model represents character count (current line length), total line width, remaining character count, next word length, etc. on a 1-dimensional "feature Manifold". This manifold takes a Helix shape embedded in a low-rank subspace (≈6D) of the high-dimensional residual stream, with features tiling the manifold. QK rotation (attention) performs boundary detection by rotating one manifold to align with another (the line-width representation). Multiple boundary heads cooperate with different offsets to estimate the remaining character count at high resolution. Next-word length and remaining character count are represented in nearly orthogonal subspaces, making the line-break decision linearly separable. The Gibbs phenomenon (ringing) creates a rippled manifold: an oscillatory residual pattern (overshoot–undershoot oscillation, like Moire and Aliasing) that emerges when projecting signals from high to low dimensions or approximating continuous values in finite dimensions.
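A toy numpy sketch of this picture (my illustration with a made-up period and frequencies, not the model's actual weights): character counts live on the circular strands of a helix-like manifold, and a block-diagonal rotation of the line-width key (a stand-in for the QK transform) shifts where the attention score peaks, mimicking boundary heads that fire a few characters before the boundary.

```python
import numpy as np

T = 128                 # helix period, chosen larger than any count so phases don't wrap
FREQS = (1, 2, 3)       # a few Fourier frequencies tiling the manifold

def circular_feats(t):
    """Circular part of the 'character count' helix: (cos, sin) per frequency."""
    t = np.asarray(t, dtype=float)[..., None]
    ang = 2 * np.pi * np.array(FREQS) * t / T
    return np.concatenate([np.cos(ang), np.sin(ang)], axis=-1)

def rotate(feats, offset):
    """Block-diagonal rotation that shifts each (cos, sin) pair by `offset` counts,
    a stand-in for a QK transform aligning one manifold with another."""
    ang = 2 * np.pi * np.array(FREQS) * offset / T
    c, s = np.cos(ang), np.sin(ang)
    cos_part, sin_part = feats[..., :len(FREQS)], feats[..., len(FREQS):]
    return np.concatenate([cos_part * c - sin_part * s,
                           cos_part * s + sin_part * c], axis=-1)

chars_typed = np.arange(120)       # current character count of the line (queries)
line_width = 80                    # line-width representation (key)

q = circular_feats(chars_typed)
k = circular_feats(line_width)

for offset in (0, 4, 8):           # different "boundary heads" anticipate the boundary
    scores = q @ rotate(k, -offset)            # QK-style dot product
    print(f"offset {offset}: score peaks at count {scores.argmax()}")
```

Rotating the key by an offset of d moves the score peak to width − d, so several heads with different offsets jointly resolve the remaining character count at finer resolution than any single head.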

Debugging why LLM thinks 9.11 > 9.8

Attention heads track relationships between numbers (pattern matching), while MLPs perform the actual calculation.
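A minimal sketch of how one might probe this division of labor (a crude stand-in, not the cited paper's causal mediation setup): zero-ablate each layer's attention output vs MLP output with TransformerLens and compare how much each hurts the answer logit on a simple arithmetic prompt. The model choice (gpt2) and the single-token answer are assumptions.

```python
import torch
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")   # small stand-in model
prompt = "23 + 45 = "
answer = model.to_single_token("68")                # assumes "68" is a single token

def zero_ablate(activation, hook):
    # Replace the component's output with zeros (crude ablation, not interchange intervention)
    return torch.zeros_like(activation)

def answer_logit(hook_name=None):
    hooks = [(hook_name, zero_ablate)] if hook_name else []
    logits = model.run_with_hooks(prompt, fwd_hooks=hooks)
    return logits[0, -1, answer].item()

baseline = answer_logit()
for layer in range(model.cfg.n_layers):
    for component in ("attn_out", "mlp_out"):
        name = utils.get_act_name(component, layer)
        print(f"layer {layer:2d} {component}: logit drop "
              f"{baseline - answer_logit(name):+.2f}")
```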

Limitations

Because it is pattern matching rather than genuine logical or mathematical reasoning, accuracy on the same problem varies with the specific numerical values used.
When irrelevant information is added to a question, the LLM fails to ignore it and performance drops sharply (see the sketch after this list).
Arithmetic is important for world modeling.
A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis
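As a rough illustration of the first two limitations (a made-up template, not GSM-Symbolic data): generate numeric variants of one word problem, plus versions with an irrelevant clause that never changes the answer, and compare a model's accuracy across them.

```python
import random

CONTEXT = "Mia picks {a} apples on Monday and {b} apples on Tuesday."
DISTRACTOR = " Five of the apples are slightly smaller than the others."
QUESTION = " How many apples does Mia have in total?"

def make_variants(n=5, seed=0, distract=False):
    """Numeric perturbations of one problem; optionally insert the irrelevant clause."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        a, b = rng.randint(10, 99), rng.randint(10, 99)
        prompt = CONTEXT.format(a=a, b=b) + (DISTRACTOR if distract else "") + QUESTION
        variants.append({"prompt": prompt, "answer": a + b})  # clause never affects the answer
    return variants

# Feed each prompt to the model under test and score against `answer`;
# accuracy variance across seeds probes value sensitivity, and the gap between
# distract=False and distract=True probes the irrelevant-context effect.
for v in make_variants(n=2, distract=True):
    print(v["answer"], "|", v["prompt"])
```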
