AI Reasoning Length

Creator
Seonglae Cho
Created
2025 Jul 19 23:46
Edited
2026 Jan 6 0:22

Optimal Reasoning Length, AI Overthinking

Chain-of-Thought (CoT) length is not a case of "longer is better": accuracy follows an inverted U-curve, rising at first and then falling beyond a certain length, which implies an optimal length exists.
If reasoning is too short (underthinking), complex subproblems cannot be properly decomposed; if it is too long (overthinking), errors accumulate across steps and performance drops. During RL training (e.g., GRPO, PPO), the average CoT length naturally converges toward shorter traces: the reward-maximization process finds the optimal length, revealing a simplicity bias. A toy sketch of this trade-off is below.
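
The inverted U-curve can be illustrated with a toy model (an illustrative assumption, not from the source): model decomposition as an exponential approach to full problem coverage, and each reasoning step as carrying an independent error probability. The product of the two peaks at a finite length.

```python
import numpy as np

# Toy model (assumption for illustration): accuracy rises as more steps
# decompose the problem (underthinking regime), then falls as per-step
# errors accumulate (overthinking regime).
def accuracy(length, decompose_rate=0.3, error_rate=0.02):
    # probability the problem is sufficiently decomposed after `length` steps
    p_decomposed = 1 - np.exp(-decompose_rate * length)
    # probability that no step introduced a fatal error
    p_no_error = (1 - error_rate) ** length
    return p_decomposed * p_no_error

lengths = np.arange(1, 200)
acc = accuracy(lengths)
optimal = lengths[np.argmax(acc)]
print(f"optimal CoT length ~ {optimal} steps, peak accuracy ~ {acc.max():.3f}")
```

Under these made-up rates the optimum lands at a moderate length; both shorter and longer traces score worse, matching the inverted U-curve described above.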

Manifold Steering

LLM overthinking lives in a low-dimensional manifold of activation space; by identifying this manifold and intervening along it, token usage can be significantly reduced while maintaining accuracy. Manifold Steering estimates the low-dimensional subspace of reasoning activations with PCA and steers only within it: overthinking is not a single direction but a phenomenon bound to a low-dimensional manifold. Results: token reduction of up to ~71% across math, code, and QA tasks, with accuracy maintained or slightly improved. A hedged sketch of the idea follows.
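
A minimal sketch of the manifold-steering recipe, assuming PyTorch and scikit-learn. The helper `collect_activations` and the layer handle are hypothetical, and the steering direction `v` is assumed to be an "overthinking" direction obtained elsewhere (e.g., a mean difference between activations of long and short traces); this is not the paper's exact implementation.

```python
import torch
from sklearn.decomposition import PCA

def manifold_projection(acts: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Fit PCA on (N, d) reasoning activations; return a (k, d) orthonormal
    basis of the top-k subspace (the estimated low-dimensional manifold)."""
    pca = PCA(n_components=k)
    pca.fit(acts.cpu().numpy())
    return torch.tensor(pca.components_, dtype=acts.dtype)

def steer_hook(basis: torch.Tensor, v: torch.Tensor, alpha: float = -4.0):
    """Forward hook that intervenes only along the manifold: project the raw
    steering direction v onto the PCA subspace before adding it."""
    v_manifold = basis.T @ (basis @ v)            # component of v inside the subspace
    v_manifold = v_manifold / v_manifold.norm()   # unit-norm steering vector
    def hook(module, inputs, output):
        # negative alpha suppresses the overthinking component of the residual stream
        return output + alpha * v_manifold.to(output.device, output.dtype)
    return hook

# Usage (hypothetical model/layer handles):
# acts = collect_activations(model, prompts)      # (N, d) residual-stream activations
# basis = manifold_projection(acts, k=8)
# handle = model.layers[20].register_forward_hook(steer_hook(basis, v))
```

The key design choice is projecting the intervention onto the PCA subspace rather than steering along a single raw direction, which is what distinguishes manifold steering from ordinary activation steering.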

Recommendations