Adaptive Branching
On ARC-AGI-2, this raises the fraction of problems solved from around 23% (repeated sampling) to over 30% (Multi-LLM AB-MCTS).
At each step, AB-MCTS chooses between "creating a new answer (breadth)" and "refining an existing answer (depth)", and also decides which LLM (model) to assign the task to. Both choices are made with Thompson sampling, which draws from posteriors built from each option's performance history.
Each node (answer) receives a reward score based on how well that answer solved the problem, and these rewards update the posteriors used for sampling.
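A minimal sketch of the selection step, using a Beta-Bernoulli posterior per (action, model) arm. This is an illustrative simplification, not the paper's exact algorithm (AB-MCTS applies Thompson sampling at every tree node with continuous score models); the arm names, reward rates, and iteration count below are all hypothetical.

```python
import random

class ThompsonArm:
    """One (action, model) option with a Beta posterior over its success rate."""
    def __init__(self, name):
        self.name = name
        self.alpha = 1.0  # prior pseudo-count of successes
        self.beta = 1.0   # prior pseudo-count of failures

    def sample(self):
        # Draw a plausible success rate from the current posterior
        return random.betavariate(self.alpha, self.beta)

    def update(self, reward):
        # reward in [0, 1]: how well the produced answer solved the problem
        self.alpha += reward
        self.beta += 1.0 - reward

def select(arms):
    """Thompson sampling: sample each posterior, pick the arm with the best draw."""
    return max(arms, key=lambda a: a.sample())

random.seed(0)
# Hypothetical arms: go wider (new answer) vs deeper (refine), per model
arms = [ThompsonArm("wider:model_A"), ThompsonArm("deeper:model_A"),
        ThompsonArm("wider:model_B"), ThompsonArm("deeper:model_B")]
# Hypothetical true success rates used to simulate rewards
true_rate = {"wider:model_A": 0.3, "deeper:model_A": 0.5,
             "wider:model_B": 0.4, "deeper:model_B": 0.7}

counts = {a.name: 0 for a in arms}
for _ in range(2000):
    arm = select(arms)
    counts[arm.name] += 1
    arm.update(1.0 if random.random() < true_rate[arm.name] else 0.0)
```

Over many steps the sampler concentrates on the historically best-performing option while still occasionally exploring the others, which is why exploration narrows adaptively rather than by a fixed branching factor.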
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive...
Recent advances demonstrate that increasing inference-time computation can significantly boost the reasoning capabilities of large language models (LLMs). Although repeated sampling (i.e.,...
https://arxiv.org/abs/2503.04412

Sakana AI
Inference-Time Scaling and Collective Intelligence for Frontier AI
https://sakana.ai/ab-mcts/


Seonglae Cho