Case-Based Reasoning (CBR) lets the agent learn through an external memory without modifying the LLM. It stores past cases, both successful and failed, and reuses them in similar situations. A memory-augmented MDP (M-MDP) formalizes the setting, and the case-selection (retrieval) policy is optimized with soft Q-learning.
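As a rough illustration of a soft Q-learning retrieval policy, the sketch below turns per-case Q-values into retrieval probabilities with a temperature-scaled softmax. This is the standard maximum-entropy form, not necessarily the exact formulation used by the system described here. The function names, the `alpha` temperature parameter, and the assumption that Q-values are already available as a non-empty list are all hypothetical.

```python
import math
import random


def soft_retrieval_probs(q_values, alpha=1.0):
    """Softmax over case Q-values with temperature alpha (assumed non-empty list)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def soft_value(q_values, alpha=1.0):
    """Soft state value: alpha * log(sum(exp(Q / alpha))), computed stably."""
    m = max(q_values)
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))


def sample_case(q_values, alpha=1.0, rng=random):
    """Sample the index of a case to retrieve, in proportion to its soft probability."""
    probs = soft_retrieval_probs(q_values, alpha)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A small `alpha` makes retrieval close to greedy (the highest-Q case is almost always chosen), while a large `alpha` spreads probability more evenly and encourages exploring other cases.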
- Planner (GPT-4.1): Creates plans and retrieves cases from memory.
- Executor (o3 / o4-mini): Executes tools through the Model Context Protocol (MCP).
- Case Memory: Stores past trajectories and retrieves them either non-parametrically (cosine similarity) or parametrically (a learned Q-function).
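The non-parametric variant of the case memory can be sketched as a store that ranks past cases by cosine similarity to the current query. This is a minimal sketch under stated assumptions: the `Case` and `CaseMemory` classes, their fields, and the `write`/`read` method names are hypothetical, and embeddings are assumed to be supplied by the caller as plain lists of floats.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Case:
    embedding: List[float]  # embedding of the task/state (assumed precomputed)
    plan: str               # the plan or trajectory that was tried
    success: bool           # whether the case ended in success or failure


def cosine(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class CaseMemory:
    cases: List[Case] = field(default_factory=list)

    def write(self, case):
        """Append a finished case (successful or failed) to memory."""
        self.cases.append(case)

    def read(self, query_embedding, k=4):
        """Return the k stored cases most similar to the query embedding."""
        ranked = sorted(
            self.cases,
            key=lambda c: cosine(query_embedding, c.embedding),
            reverse=True,
        )
        return ranked[:k]
```

In the parametric variant, the similarity score used for ranking would be replaced by a learned Q-function over (state, case) pairs. The linear scan is fine for illustration but would typically give way to an approximate nearest-neighbor index as the memory grows.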
This extends the RAG + RL line of work to an agentic setting in which the corpus is dynamic.