HOPE

Creator: Seonglae Cho
Created: 2025 Nov 12 11:16
Edited: 2025 Nov 12 17:14

Higher-Order Persistent Engine

  • Deep Optimizers – Reinterprets Adam and similar optimizers as "deep memory modules" and proposes more expressive optimizers (e.g., Deep Momentum GD)
    • Transforms momentum from a simple vector into a small MLP or nonlinear module, enabling richer learning and memory of past gradient patterns (a toy optimizer sketch follows this list)
  • Self-Modifying Titans – Sequence models that learn their own update rules and transform themselves
    • Does this mean outer-loop training of a network that projects gradients? (a sketch of that reading follows this list)
  • Continuum Memory System (CMS) – Generalizes short-term and long-term memory as a continuous spectrum, mimicking the brain's multi-timescale memory
    • Applies different update frequencies to MLP blocks: fast layers act like short-term memory, slow layers like long-term memory, enabling continual learning across time scales (a frequency-masked training sketch follows this list)
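A minimal sketch of the "momentum as a deep memory module" idea, not the paper's exact Deep Momentum GD: instead of an exponential moving average, the momentum state is written through a small shared MLP applied elementwise to (gradient, previous momentum) pairs. The names `DeepMomentumGD` and `memory_net` are hypothetical, and the outer loop that would train the memory MLP itself is omitted.

```python
import torch
import torch.nn as nn

class DeepMomentumGD(torch.optim.Optimizer):
    """Toy optimizer: the momentum buffer is produced by a small MLP
    instead of the usual beta * m + g moving average."""
    def __init__(self, params, memory_net, lr=1e-2):
        super().__init__(params, dict(lr=lr))
        self.memory_net = memory_net  # maps (grad, old momentum) -> new momentum, elementwise

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if "m" not in state:
                    state["m"] = torch.zeros_like(p)
                g, m = p.grad.flatten(), state["m"].flatten()
                # Nonlinear memory write: one tiny MLP is shared across all parameters.
                new_m = self.memory_net(torch.stack([g, m], dim=-1)).squeeze(-1)
                state["m"] = new_m.view_as(p)
                p.add_(state["m"], alpha=-group["lr"])

# Usage: memory_net would itself be trained in an outer loop (not shown here).
memory_net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
model = nn.Linear(4, 1)
opt = DeepMomentumGD(model.parameters(), memory_net, lr=1e-2)
x = torch.randn(8, 4)
loss = (model(x) ** 2).mean()
loss.backward()
opt.step()
```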
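To make the question above concrete, here is a toy version of the "outer-loop trained gradient projection" reading. This is a generic learned-optimizer setup, not necessarily what Self-Modifying Titans actually do: a hypothetical `update_net` maps each inner gradient to a parameter update, and its own weights are trained in an outer loop by unrolling a few inner steps and backpropagating the final task loss.

```python
import torch
import torch.nn as nn

# update_net turns a gradient into a parameter update ("projects" the gradient).
update_net = nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))
outer_opt = torch.optim.Adam(update_net.parameters(), lr=1e-3)

def inner_unroll(w, xs, ys, steps=3):
    """Unroll learned updates on a scalar regression task, keeping the graph."""
    for _ in range(steps):
        loss = ((xs * w - ys) ** 2).mean()
        (g,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w + update_net(g.view(1, 1)).view(())   # learned update from the gradient
    return ((xs * w - ys) ** 2).mean()              # final loss drives the outer update

for _ in range(100):                                # outer loop over fresh inner episodes
    xs = torch.randn(32)
    ys = 2.0 * xs
    w = torch.zeros((), requires_grad=True)         # fresh inner parameter each episode
    outer_loss = inner_unroll(w, xs, ys)
    outer_opt.zero_grad()
    outer_loss.backward()                           # trains update_net through the unroll
    outer_opt.step()
```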
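A rough sketch of the multi-timescale idea behind CMS, under the simplifying assumption that "slow" blocks simply skip optimizer updates until their period elapses (the paper's actual update rule may differ): a stack of residual MLP blocks where block i is only updated every periods[i] steps, so early blocks behave like fast short-term memory and later blocks like slow long-term memory. `ContinuumMLP`, `apply_frequency_mask`, and `periods` are illustrative names.

```python
import torch
import torch.nn as nn

class ContinuumMLP(nn.Module):
    def __init__(self, dim=64, periods=(1, 4, 16, 64)):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in periods
        ])
        self.periods = periods

    def forward(self, x):
        for block in self.blocks:
            x = x + block(x)          # residual chain of MLP blocks
        return x

model = ContinuumMLP()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

def apply_frequency_mask(model, step):
    """Drop gradients of blocks whose update period has not elapsed this step."""
    for block, period in zip(model.blocks, model.periods):
        if step % period != 0:
            for p in block.parameters():
                p.grad = None         # AdamW skips parameters with no gradient

for step in range(1, 257):            # toy training loop on random data
    x = torch.randn(8, 64)
    loss = (model(x) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    apply_frequency_mask(model, step)
    opt.step()
```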
HOPE achieves lower perplexity and higher accuracy than existing models such as Transformers, RetNet, and Titans on language modeling and commonsense reasoning benchmarks (PIQA, ARC, BoolQ, etc.).