R-Zero

Creator
Seonglae Cho
Created
2025 Aug 25 21:56
Edited
2025 Aug 25 21:57
R-Zero is a framework in which a model improves by creating and solving its own problems, without external data. A single base LLM is split into two roles: a Challenger (problem generator) and a Solver (problem solver). Each role is optimized with GRPO (Group Relative Policy Optimization, a reinforcement learning technique), and the cycle is repeated so that the two co-evolve.
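To make the GRPO step more concrete, here is a minimal sketch of the group-relative advantage that GRPO is generally described as using: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its group. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for illustration. This note does not say how R-Zero defines the rewards for the Challenger or the Solver, so the rewards below are placeholder values.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    rewards: scalar rewards for responses sampled from the same prompt.
    eps: small constant (an assumption here) that avoids division by
         zero when every response in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Placeholder rewards: 1.0 for a response judged correct, 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above their group's average get a positive advantage and are reinforced, while those below average are discouraged. Because the baseline comes from the group itself, no separate value model is needed, which suits a setup where one base LLM plays both roles.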
