DR Tulu
DR Tulu is a fully open training recipe for long-form, tool-using deep research agents. The model learns planning, retrieval, synthesis, and citation end to end through SFT followed by RLER (Reinforcement Learning with Evolving Rubrics). During a task it selects among MCP-based tools (web search, web browsing, paper search) and grounds its responses with citations. RLER scores long-form responses with an LLM-as-a-judge against rubrics that evolve alongside the policy during training instead of staying fixed.
DR Tulu: An open, end-to-end training recipe for long-form deep research | Ai2
We introduce Deep Research Tulu (DR Tulu), an open post-training recipe and framework for long-form deep research agents.
https://allenai.org/blog/dr-tulu
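
To make the rubric-based reward concrete, here is a minimal sketch of scoring a response against weighted rubrics and then refreshing the rubric set from a batch of rollouts. It is an illustration under stated assumptions, not the DR Tulu implementation: the names `Rubric`, `keyword_judge`, `rubric_reward`, and `evolve_rubrics` are hypothetical, the judge is a keyword-matching stand-in for an LLM call, and the rule of keeping only rubrics whose scores vary across rollouts is an assumed filtering heuristic.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Callable, List


@dataclass(frozen=True)
class Rubric:
    """One grading criterion (hypothetical structure, not from the DR Tulu code)."""
    description: str
    weight: float = 1.0


# A judge maps (response, rubric) to a score in [0, 1].
Judge = Callable[[str, Rubric], float]


def keyword_judge(response: str, rubric: Rubric) -> float:
    """Toy stand-in for an LLM judge: 1.0 if the rubric text occurs in the response."""
    return 1.0 if rubric.description.lower() in response.lower() else 0.0


def rubric_reward(response: str, rubrics: List[Rubric], judge: Judge) -> float:
    """Weighted average of judge scores over all rubrics."""
    total_weight = sum(r.weight for r in rubrics)
    if total_weight == 0:
        return 0.0
    return sum(r.weight * judge(response, r) for r in rubrics) / total_weight


def evolve_rubrics(
    rubrics: List[Rubric],
    candidates: List[Rubric],
    rollouts: List[str],
    judge: Judge,
    max_rubrics: int = 4,
) -> List[Rubric]:
    """Merge new candidate rubrics, then keep the most discriminative ones.

    Assumption: a rubric that gives every rollout the same score carries no
    training signal, so it is dropped; the rest are ranked by score spread.
    """
    if not rollouts:
        return list(rubrics)
    pool = list(dict.fromkeys(rubrics + candidates))  # de-duplicate, keep order
    scored = []
    for rubric in pool:
        spread = pstdev([judge(resp, rubric) for resp in rollouts])
        if spread > 0:
            scored.append((spread, rubric))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rubric for _, rubric in scored[:max_rubrics]]
```

In a real training loop the judge would be an LLM call and the candidate rubrics would be proposed from the current policy's rollouts; both are left as plain function arguments here so the sketch stays self-contained.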


Seonglae Cho