Hierarchical Reasoning Model

Creator: Seonglae Cho
Created: 2025 Aug 1 23:13
Edited: 2025 Aug 1 23:17

HRM

HRM achieves exceptional performance on complex reasoning tasks while training on only about 1,000 samples.
It pairs two recurrent modules running at different timescales: a slow high-level (H) module and a fast low-level (L) module, with multiple L steps per single H step. Hierarchical convergence prevents premature convergence: L converges to a local equilibrium under a fixed H context, then H updates the context and resets L for a fresh round of computation. Training uses a 1-step gradient (a first-order approximation of the DEQ implicit gradient) together with deep supervision, so backpropagation needs only O(1) memory rather than unrolling through every step. In essence, HRM combines timescale separation, hierarchical convergence, the 1-step gradient, and Adaptive Computation Time (ACT).
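To make the control flow concrete, here is a minimal PyTorch sketch of the two-timescale loop, the 1-step gradient, and segment-wise deep supervision. It is an illustration under assumptions, not the paper's implementation: GRUCell stands in for HRM's recurrent Transformer blocks, and HRMSketch, n_cycles, and t_steps are hypothetical names. The ACT halting head is omitted.

```python
import torch
import torch.nn as nn

class HRMSketch(nn.Module):
    """Sketch of HRM's two-timescale H/L loop (hypothetical, simplified)."""

    def __init__(self, dim: int):
        super().__init__()
        self.l_cell = nn.GRUCell(dim, dim)  # fast low-level (L) module
        self.h_cell = nn.GRUCell(dim, dim)  # slow high-level (H) module
        self.head = nn.Linear(dim, dim)     # output head for deep supervision

    def forward(self, x, z_h=None, z_l=None, n_cycles: int = 4, t_steps: int = 4):
        z_h = torch.zeros_like(x) if z_h is None else z_h
        z_l = torch.zeros_like(x) if z_l is None else z_l
        # All but the final H and L updates run without gradient: the DEQ-style
        # first-order (1-step) approximation, which makes backprop O(1) memory.
        with torch.no_grad():
            for _ in range(n_cycles):
                for _ in range(t_steps):
                    z_l = self.l_cell(x + z_h, z_l)  # L converges under fixed H context
                z_h = self.h_cell(z_l, z_h)          # H updates context; L restarts from it
        z_l = self.l_cell(x + z_h, z_l)  # only this L step carries gradient
        z_h = self.h_cell(z_l, z_h)      # only this H step carries gradient
        return self.head(z_h), z_h, z_l

# Deep supervision: a loss (and optimizer step) per segment, with the carried
# state detached so no gradient flows across segment boundaries.
model = HRMSketch(dim=64)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, target = torch.randn(8, 64), torch.randn(8, 64)
z_h = z_l = None
for _ in range(3):
    y, z_h, z_l = model(x, z_h, z_l)
    loss = nn.functional.mse_loss(y, target)
    opt.zero_grad(); loss.backward(); opt.step()
    z_h, z_l = z_h.detach(), z_l.detach()  # carry state forward without grad
```

The key point of the sketch: because everything except the last L and H updates runs under no_grad, memory cost is independent of how many reasoning steps the model takes, which is what allows the deep, iterated computation with ordinary backpropagation.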