Darwin Gödel Machine

Creator: Seonglae Cho
Created: 2025 Jun 13 19:16
Edited: 2025 Jun 13 19:23
Each self-modification is treated as an experiment: it is benchmarked for empirical validation, and the result is archived (knowledge accumulation). This automates the research cycle. Rather than optimizing a single objective, the system focuses on discovering new ideas and algorithms through exploration.

Going beyond fixed, human-designed structures, the AI modifies and validates its own code in a continuing loop of self-improvement, aiming for cumulative, open-ended development similar to scientific discovery. Open-ended exploration creates diverse 'stepping stones' that diversify search paths and help the search escape local optima.
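The loop described above (pick an archived agent, let it modify itself, benchmark the result, archive it) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `Agent`, `self_modify`, `evaluate`, and `run` are hypothetical names, a single `skill` number stands in for the agent's source code, and the parent-selection weighting is an assumed heuristic rather than the formula used in the actual system.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Agent:
    """Hypothetical stand-in for a coding agent; `skill` replaces real source code."""
    skill: float
    parent: Optional[int] = None  # archive index of the parent, None for the root
    score: float = 0.0
    children: int = 0


def self_modify(parent: Agent, rng: random.Random) -> float:
    """Placeholder for the agent rewriting its own code; the change may help or hurt."""
    return min(1.0, max(0.0, parent.skill + rng.gauss(0.0, 0.1)))


def evaluate(skill: float, rng: random.Random, tasks: int = 50) -> float:
    """Placeholder benchmark: fraction of simulated tasks solved."""
    return sum(rng.random() < skill for _ in range(tasks)) / tasks


def run(generations: int = 30, seed: int = 0) -> List[Agent]:
    rng = random.Random(seed)
    root = Agent(skill=0.2)
    root.score = evaluate(root.skill, rng)
    archive = [root]
    for _ in range(generations):
        # Assumed weighting: favor high scores, but keep every archived agent
        # reachable and give less-explored agents a bonus.
        weights = [(a.score + 0.05) / (1 + a.children) for a in archive]
        idx = rng.choices(range(len(archive)), weights=weights, k=1)[0]
        parent = archive[idx]
        parent.children += 1
        child = Agent(skill=self_modify(parent, rng), parent=idx)
        child.score = evaluate(child.skill, rng)  # empirical validation
        archive.append(child)  # kept even if worse: a possible stepping stone
    return archive


if __name__ == "__main__":
    archive = run()
    best = max(archive, key=lambda a: a.score)
    print(f"archive size: {len(archive)}, best score: {best.score:.2f}")
```

The design point is that the archive never discards an agent. A child that scores worse than its parent can still be sampled later, which is what lets the search branch away from a local optimum instead of only climbing from the current best.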

Results

The key improvements the system found include tool refinement and multi-patch evaluation.
  • SWE-bench success rate 20% → 50%
  • Polyglot success rate 14.2% → 30.7%
This is not an entirely fair comparison, since the baseline removes the self-improvement and open-ended elements. Still, it is one of the few cases where an Evolutionary algorithm has proven effective in AI Agent and LLM applications.
