Reasoning Compounding Error

A design that samples each step aggressively for diversity and uses voting to reduce per-step error, so that errors do not compound over long tasks. It is achievable even with small non-reasoning models.

Maximal Agentic Decomposition (MAD): Break the task into single-step subtasks, each handled by its own microagent.

First-to-ahead-by-k voting: Sample each step multiple times and keep voting until one answer leads all others by k votes.

Red-flagging: Discard outputs with incorrect formatting or abnormal length to reduce correlated errors.
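A minimal Python sketch of how red-flagging and first-to-ahead-by-k voting could fit together for a single step. This is an illustration, not the paper's implementation: the names `is_red_flagged` and `first_to_ahead_by_k`, the `"move "` format check, the length limit, and the `max_samples` cap are all assumptions made for the example, and `sample` stands in for one call to a microagent.

```python
from collections import Counter
from typing import Callable, Optional


def is_red_flagged(output: str, max_len: int = 200) -> bool:
    """Hypothetical red-flag rule: reject empty, overlong, or misformatted outputs."""
    text = output.strip()
    return not text or len(text) > max_len or not text.startswith("move ")


def first_to_ahead_by_k(
    sample: Callable[[], str], k: int, max_samples: int = 1000
) -> Optional[str]:
    """Draw samples until one answer leads every other answer by k votes.

    Returns the winning answer, or None if the (assumed) sample budget runs out.
    """
    votes: Counter = Counter()
    for _ in range(max_samples):
        candidate = sample()
        if is_red_flagged(candidate):
            continue  # discarded outputs never receive a vote
        votes[candidate] += 1
        ranked = votes.most_common(2)
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if ranked[0][1] - runner_up >= k:
            return ranked[0][0]
    return None
```

Usage: pass a zero-argument callable that returns one sampled answer, for example `first_to_ahead_by_k(lambda: ask_model(step_prompt), k=3)`, where `ask_model` and `step_prompt` are placeholders for whatever model call and prompt the step uses.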

Result: Completed a 20-disk Towers of Hanoi puzzle, which takes over 1,000,000 steps, with zero errors. Across sampling and voting rounds, the number of undecided steps decreases exponentially to 0.
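The arithmetic behind the compounding-error problem can be sketched in a few lines of Python. The sketch assumes the 20-disk puzzle is Towers of Hanoi (whose optimal solution has 2^n - 1 moves), that steps fail independently, and, for the voting estimate, that each sample is either the correct answer or a single wrong alternative. Under that last assumption, first-to-ahead-by-k voting is a gambler's-ruin race. The function names and the example probabilities are illustrative and are not taken from the paper.

```python
def hanoi_moves(disks: int) -> int:
    """Number of moves in the optimal Towers of Hanoi solution."""
    return 2 ** disks - 1


def chain_success(p_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return p_step ** steps


def vote_success(p_sample: float, k: int) -> float:
    """Probability the correct answer gets k votes ahead first.

    Assumes two candidate answers and independent samples (gambler's ruin).
    """
    q = 1.0 - p_sample
    return 1.0 / (1.0 + (q / p_sample) ** k)


steps = hanoi_moves(20)                    # 1,048,575 moves
no_voting = chain_success(0.99, steps)     # effectively zero
with_voting = chain_success(vote_success(0.99, 5), steps)
print(steps, no_voting, with_voting)
```

With these illustrative numbers, a 99%-accurate step compounds to essentially zero success over the full task, while voting with k = 5 pushes the per-step error low enough for the whole chain to succeed with high probability.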

https://arxiv.org/pdf/2511.09030
