In reasoning models, increasing test-time compute (thinking tokens) does not always improve results. Reverse (inverse) scaling, where accuracy actually decreases as the model reasons longer, has been observed across multiple tasks.
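One way to check for the effect described above is to measure accuracy at several thinking-token budgets and look at the trend. The sketch below is only an illustration under assumptions: the function names `scaling_trend` and `is_inverse_scaling`, the log2 transform of the budget, and the example numbers are all hypothetical and do not come from any measured result.

```python
import math
from typing import Sequence


def scaling_trend(budgets: Sequence[int], accuracies: Sequence[float]) -> float:
    """Least-squares slope of accuracy against log2(thinking-token budget).

    A positive slope means accuracy tends to rise with more thinking tokens;
    a negative slope suggests reverse (inverse) scaling.
    """
    if len(budgets) != len(accuracies) or len(budgets) < 2:
        raise ValueError("need at least two (budget, accuracy) pairs of equal length")
    xs = [math.log2(b) for b in budgets]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(accuracies) / len(accuracies)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        raise ValueError("budgets must not all be identical")
    return cov / var


def is_inverse_scaling(
    budgets: Sequence[int], accuracies: Sequence[float], tol: float = 0.0
) -> bool:
    """Return True when the fitted slope is below -tol (hypothetical criterion)."""
    return scaling_trend(budgets, accuracies) < -tol


# Made-up numbers for illustration only, not measurements from any model.
example_budgets = [1024, 2048, 4096, 8192]
example_accuracies = [0.80, 0.78, 0.74, 0.70]
print(is_inverse_scaling(example_budgets, example_accuracies))
```

A fitted slope is a coarse summary. Real curves may rise first and fall later, so plotting accuracy per budget is usually more informative than a single number.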
Test-time Scaling
Creator: Seonglae Cho
Created: 2025 Aug 1 23:31
Editor: Seonglae Cho
Edited: 2025 Aug 1 23:31
