Agent Workflow Memory

Creator
Seonglae Cho
Created
2026 Feb 16 12:13
Edited
2026 Feb 16 12:22
Refs
Agent Workflow Memory (AWM) extracts reusable sub-routines (workflows) from agent trajectories, stores them in memory, and reuses them as guides for future tasks.
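The extract, store, reuse cycle described above can be pictured with a small data structure. This is only a minimal sketch under stated assumptions: the names `Workflow`, `WorkflowMemory`, `add`, and `as_prompt` are hypothetical, as is the `{query}` placeholder convention for abstracted arguments. None of this is taken from the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical record: a goal description plus abstracted action steps."""
    description: str
    steps: list[str]


@dataclass
class WorkflowMemory:
    """Hypothetical store of induced workflows, rendered as text for the agent."""
    workflows: list[Workflow] = field(default_factory=list)

    def add(self, workflow: Workflow) -> None:
        # Skip a workflow whose description is already stored.
        if all(w.description != workflow.description for w in self.workflows):
            self.workflows.append(workflow)

    def as_prompt(self) -> str:
        # Render every stored workflow as plain text that could be
        # prepended to the agent's context for a new task.
        blocks = []
        for w in self.workflows:
            lines = [f"Workflow: {w.description}"]
            lines += [f"{i}. {s}" for i, s in enumerate(w.steps, start=1)]
            blocks.append("\n".join(lines))
        return "\n\n".join(blocks)


memory = WorkflowMemory()
memory.add(Workflow(
    description="Search for an item by {query}",
    steps=["click the search box", "type {query}", "press Enter"],
))
prompt = memory.as_prompt()
print(prompt)
```

Keeping the steps abstract (a placeholder instead of a concrete search term) is what would let one stored workflow guide many different future tasks.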
Agents on long-horizon tasks such as web navigation struggle in three ways: (1) they cannot reliably follow long action trajectories, (2) they overfit to their examples (in-context or training), which makes them fragile when the context or environment changes, and (3) they solve each task independently, without cumulative learning from past successes and failures.
  • Offline AWM: Induces workflows from training examples, then uses them as memory during testing.
  • Online AWM: While streaming through test cases, keeps only the experiences that an evaluator judges successful, then induces workflows from them to continuously update memory (snowballing into increasingly complex workflows).
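The difference between the two modes can be sketched as two small loops. This is a rough illustration, not the paper's code: `offline_awm`, `online_awm`, and the callables `run_agent`, `is_success`, and `induce` are hypothetical stand-ins. In a real system they would presumably be an LLM agent, an evaluator, and a workflow induction step; here they are toy functions so the sketch runs.

```python
from typing import Callable

Trajectory = list[str]  # the sequence of actions taken for one task


def offline_awm(
    train_examples: list[tuple[str, Trajectory]],
    induce: Callable[[str, Trajectory], list[str]],
) -> list[str]:
    """Build the memory once from training examples, before testing."""
    memory: list[str] = []
    for task, trajectory in train_examples:
        for workflow in induce(task, trajectory):
            if workflow not in memory:
                memory.append(workflow)
    return memory


def online_awm(
    tasks: list[str],
    run_agent: Callable[[str, list[str]], Trajectory],
    is_success: Callable[[str, Trajectory], bool],
    induce: Callable[[str, Trajectory], list[str]],
) -> list[str]:
    """Stream through test tasks, growing the memory from successes only."""
    memory: list[str] = []
    for task in tasks:
        # The agent sees every workflow induced so far.
        trajectory = run_agent(task, memory)
        # Failed attempts are filtered out and never reach the memory.
        if is_success(task, trajectory):
            for workflow in induce(task, trajectory):
                if workflow not in memory:
                    memory.append(workflow)
    return memory


# Toy stand-ins (assumptions for illustration only).
def toy_agent(task: str, memory: list[str]) -> Trajectory:
    return [f"do:{task}"]


def toy_evaluator(task: str, trajectory: Trajectory) -> bool:
    return "fail" not in task


def toy_induce(task: str, trajectory: Trajectory) -> list[str]:
    return [f"workflow for {task}"]


online_memory = online_awm(
    ["search", "fail-login", "checkout"], toy_agent, toy_evaluator, toy_induce
)
offline_memory = offline_awm([("search", ["do:search"])], toy_induce)
print(online_memory)
```

Because the memory passed to the agent grows after each successful task, later tasks can build on earlier workflows, which is the snowballing effect noted in the list.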
WebArena (online only): Success rate (SR) increased substantially over the BrowserGym baseline (the paper reports +12.0 percentage points, +51.1% relative), and the average number of steps per task decreased, so the agent is also more efficient. AWM also reports a higher overall SR than SteP, which uses human workflows.
Mind2Web: Step SR and task SR improved in the cross-task setting. The improvement in element selection accuracy was particularly large, because workflows give good guidance on which UI element to click.
Overall, extracting reusable abstract sub-routines from examples and accumulating them in memory was particularly strong for online adaptation and under distribution gaps.
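The absolute and relative WebArena gains quoted above together imply approximate success rates for the baseline and for AWM. The short calculation below derives them. The resulting values are inferred from the two quoted figures, not quoted from the paper, and they inherit the rounding of those figures. The variable names are illustrative.

```python
abs_gain_pp = 12.0  # absolute gain in percentage points (quoted above)
rel_gain = 0.511    # relative gain of 51.1% (quoted above)

# relative gain = absolute gain / baseline, so baseline = absolute / relative
baseline_sr = abs_gain_pp / rel_gain  # implied baseline success rate, in %
awm_sr = baseline_sr + abs_gain_pp    # implied AWM success rate, in %

print(f"implied baseline SR: {baseline_sr:.1f}%")  # 23.5%
print(f"implied AWM SR:      {awm_sr:.1f}%")       # 35.5%
```

This shows why the two figures differ so much: a 12-point gain on a baseline of roughly 23.5% is a gain of about half in relative terms.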
 
 
 
 
arxiv.org
 
 
