Ability to ask a question
Research and development is generally not as economically valuable as people assume, and automating R&D jobs is significantly harder than it might naively seem. Most AI value will come from broad automation.
Research agents frequently use evolutionary algorithms, likely because this field has relatively few cost constraints and benefits from exploring as many diverse approaches as possible.
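As a rough illustration of why evolutionary search suits open-ended method exploration, here is a minimal sketch of a mutate-and-select loop. It is not taken from any particular research agent: the function names (`evolve`, `toy_fitness`), the population sizes, and the toy objective are all assumptions chosen for illustration. In a real agent, `fitness` would be an experiment score and `mutate` an LLM-proposed variation of a method.

```python
import random


def evolve(fitness, init, mutate, generations=50, population_size=20, elite=5, seed=0):
    """Generic elitist evolutionary loop (illustrative sketch, not a specific system).

    fitness: candidate -> float, higher is better
    init:    rng -> candidate, creates a random starting candidate
    mutate:  (candidate, rng) -> candidate, proposes a variation
    """
    rng = random.Random(seed)
    population = [init(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best candidates unchanged (elitism) ...
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        # ... and fill the rest of the population with mutated copies of them.
        children = [mutate(rng.choice(parents), rng) for _ in range(population_size - elite)]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(x):
    # Hypothetical objective: the best "method" is x = 3.0.
    return -(x - 3.0) ** 2


best = evolve(
    toy_fitness,
    init=lambda rng: rng.uniform(-10.0, 10.0),
    mutate=lambda x, rng: x + rng.gauss(0.0, 0.5),
)
print(f"best candidate: {best:.3f}")
```

The loop spends most of its budget on evaluating many variations, which is affordable only when each evaluation is cheap relative to the value of finding a better method.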
AI Research Agents
AI Research Agent Benchmarks
AI Research Tools
Software Security
Google's 'Big Sleep' AI Project Uncovers Real Software Vulnerabilities
The company's experimental AI agent finds a previously unknown and exploitable software bug in SQLite, an open-source database engine.
https://uk.pcmag.com/ai/155143/googles-big-sleep-ai-project-uncovers-real-software-vulnerabilities

Google AI co-scientist: a multi-agent system that uses Elo scores for self-improvement


Accelerating scientific breakthroughs with an AI co-scientist
Juraj Gottweis, Google Fellow, and Vivek Natarajan, Research Lead
https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
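To make the Elo idea concrete, the sketch below applies the standard Elo rating update to pairwise comparisons between hypotheses. This is only the textbook formula, not the co-scientist's actual implementation, which is not described in these notes. The starting rating of 1200, the K-factor of 32, and the hypothesis names are assumptions.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a draw.
    k (assumed 32 here) controls how far one result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Hypothetical tournament: each tuple is (winner, loser) of one pairwise debate.
ratings = {"hypothesis_a": 1200.0, "hypothesis_b": 1200.0, "hypothesis_c": 1200.0}
matches = [
    ("hypothesis_a", "hypothesis_b"),
    ("hypothesis_a", "hypothesis_c"),
    ("hypothesis_c", "hypothesis_b"),
]
for winner, loser in matches:
    ratings[winner], ratings[loser] = update_elo(ratings[winner], ratings[loser], 1.0)

ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

Because each update is zero-sum, the ratings only express relative ordering, which is what a ranking tournament needs in order to decide which hypotheses to refine next.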

Automating scientific research experiments (FutureHouse)
Meet the Humans Building AI Scientists
A look inside FutureHouse, a nonprofit research institute in San Francisco.
https://press.asimov.com/articles/futurehouse

Context engineering (not diffusion)
Research contexts are complex and non-linear, which makes context engineering difficult.
- Developer agent
- Experimenter agent - Jupyter notebook access (Jupyter MCP)
You and Your Research Agent: Lessons From Using Agents for Interpretability Research
We’ve been using AI agents to assist with interpretability research for the past several months. In this post, we’re sharing some of the lessons we’ve learned: how agents for experimentation are different than agents for software development, where they perform well, and where they still fall short. We’re also open sourcing a basic implementation of the most important tool we’ve found for enabling effective agentic experimentation — a general-purpose Jupyter server/notebook MCP package — and a suite of interpretability tasks that can be run using the tool.
https://www.goodfire.ai/blog/you-and-your-research-agent

Shallow loop limitation
Agents 2.0: From Shallow Loops to Deep Agents
An overview of the architectural shift from Shallow Agents (Agent 1.0) to Deep Agents (Agent 2.0) and how to build complex AI agents that can handle multi-step tasks over extended periods.
https://www.philschmid.de/agents-2.0-deep-agents

Seonglae Cho

