Absolute Zero Reasoner
A new self-play RL method in which the model both creates and solves problems, learning without external training data. However, the name is incorrect from an information-theoretic perspective: the method still requires a minimal amount of external information, namely compilers and interpreters that serve as the environment and provide rewards through code execution.
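To make the point about the interpreter concrete, here is a minimal sketch of how code execution could supply a reward. It is an illustration under stated assumptions, not the method's actual implementation: the names `run_program` and `execution_reward`, the convention that a proposed task defines a function `f`, and the binary 0/1 reward are all hypothetical.

```python
def run_program(program_src, arg):
    """Execute the proposed program and return f(arg).

    Assumption: the program source defines a function named `f`.
    """
    namespace = {}
    # The Python interpreter is the external environment here:
    # it, not the model, decides what the program actually computes.
    exec(program_src, namespace)
    return namespace["f"](arg)


def execution_reward(program_src, arg, predicted_output):
    """Return 1.0 if the solver's prediction matches real execution, else 0.0."""
    try:
        actual_output = run_program(program_src, arg)
    except Exception:
        # Invalid or crashing programs earn no reward (hypothetical choice).
        return 0.0
    return 1.0 if actual_output == predicted_output else 0.0


# Hypothetical proposer output: a task written by the model itself.
proposed_task = "def f(x):\n    return sorted(x)[::-1]\n"
task_input = [3, 1, 2]

# Hypothetical solver outputs: one correct prediction and one wrong one.
reward_correct = execution_reward(proposed_task, task_input, [3, 2, 1])
reward_wrong = execution_reward(proposed_task, task_input, [1, 2, 3])
```

No dataset appears anywhere in the sketch, yet the reward is only meaningful because the interpreter encodes the language's semantics. That is the external information the note refers to.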