AI Reasoning

Creator
Seonglae Cho
Created
2024 Apr 1 6:21
Editor
Edited
2025 Mar 27 23:11

Model Generalization

AI reasoning is the ability to use information beyond what is given, or to build logical steps toward a conclusion that is not explicitly stated.

Objectives

Findings

  • Procedural knowledge in documents drives influence on reasoning traces
  • For factual questions, the answer itself often shows up in highly influential documents, whereas for reasoning questions it does not
  • LLMs rely on procedural knowledge to learn to produce zero-shot reasoning traces
  • There is evidence that code is important for reasoning

AI Reasoning Types

Procedural Knowledge in Pretraining

We observe that code data is highly influential for reasoning. StackExchange as a source contributes more than ten times as much influential data to the top and bottom portions of the rankings as would be expected if the influential data were randomly sampled from the pretraining distribution. Other code sources, along with ArXiv and Markdown, are at least twice as influential as expected under random sampling from the pretraining distribution.
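
The comparison above boils down to a ratio: a source's share among the most influential documents divided by its share of the pretraining data. Below is a minimal sketch of that ratio, assuming each ranked document carries a source label. The function name `overrepresentation`, the source labels, and all numbers are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from collections import Counter


def overrepresentation(ranked_sources, pretraining_share):
    """Return, per source, (share among ranked documents) / (share in pretraining data).

    A ratio of 1.0 means the source appears as often as random sampling
    from the pretraining distribution would predict.
    """
    counts = Counter(ranked_sources)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("ranked_sources must not be empty")
    return {
        source: (counts.get(source, 0) / total) / share
        for source, share in pretraining_share.items()
    }


# Toy numbers, invented for illustration only (not results from the study).
pretraining_share = {"web": 0.80, "code": 0.15, "stackexchange": 0.05}
top_sources = ["stackexchange"] * 5 + ["code"] * 3 + ["web"] * 2

ratios = overrepresentation(top_sources, pretraining_share)
for source, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{source}: {ratio:.2f}x expected")
```

A ratio well above 1 marks a source as over-represented among influential documents, which is the sense in which the note calls StackExchange and other code sources "more influential than expected".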
 
 

Recommendations