Titans Memory

Creator: Seonglae Cho
Created: 2025 Jan 18 13:46
Edited: 2025 Mar 16 17:33

Learning to Memorize at Test Time

They focus on a long-term associative memory, in which the aim is to store past data as key–value pairs. Similar to Transformers, two linear layers project $x_t$ into a key and a value:
$$k_t = x_t W_K, \qquad v_t = x_t W_V$$
$$\ell(M_{t-1}; x_t) = \| M_{t-1}(k_t) - v_t \|^2$$
The memory module is expected to learn the associations between keys and values, so the loss is defined as the MSE between the values and the values reconstructed from the keys. Accordingly, the inner loop optimizes the memory $M$'s weights, while the outer loop optimizes the remaining parameters of the architecture, such as $W_K$ and $W_V$.
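A minimal NumPy sketch of one inner-loop memorization step, assuming a purely linear memory $M$ (the paper uses a deeper MLP) and random stand-ins for the outer-loop-learned projections $W_K$, $W_V$:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy model dimension (assumed)

# Outer-loop parameters (random stand-ins for learned projections).
W_K = rng.standard_normal((d, d)) / np.sqrt(d)
W_V = rng.standard_normal((d, d)) / np.sqrt(d)

# Linear memory M: a single matrix mapping keys to values.
M = np.zeros((d, d))

def inner_step(M, x_t, lr=0.01):
    """One test-time memorization step: gradient descent on
    l(M; x_t) = ||M k_t - v_t||^2 with respect to M only."""
    k = x_t @ W_K
    v = x_t @ W_V
    err = M @ k - v                  # prediction error
    grad = 2.0 * np.outer(err, k)    # dl/dM for the linear memory
    return M - lr * grad

x = rng.standard_normal(d)
loss_before = np.sum((M @ (x @ W_K) - x @ W_V) ** 2)
M = inner_step(M, x)
loss_after = np.sum((M @ (x @ W_K) - x @ W_V) ** 2)
# One step reduces the associative loss on this token.
```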

Adaptive Forgetting

When dealing with very long sequences, it is crucial to manage which past information should be forgotten. An adaptive forgetting mechanism lets the memory discard information that is no longer needed, making better use of its limited capacity:
$$M_t = (1 - \alpha_t) M_{t-1} + S_t$$
where $\alpha_t \in [0,1]$ is a gating term that flexibly controls how much of the memory is erased.
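In the paper, $S_t$ is the "surprise" term, a momentum-style accumulation of the loss gradient, and the gates $\alpha_t$, $\eta_t$, $\theta_t$ are data-dependent. A sketch of the combined update, assuming a linear memory and fixed scalar gates for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # toy dimension (assumed)

W_K = rng.standard_normal((d, d)) / np.sqrt(d)
W_V = rng.standard_normal((d, d)) / np.sqrt(d)

M = np.zeros((d, d))  # memory weights
S = np.zeros((d, d))  # momentum-like "surprise" accumulator

def titans_update(M, S, x_t, alpha=0.05, eta=0.9, theta=0.01):
    """One memory update with adaptive forgetting:
       S_t = eta * S_{t-1} - theta * grad   (surprise with momentum)
       M_t = (1 - alpha) * M_{t-1} + S_t    (alpha in [0,1] gates forgetting)
    In the paper alpha_t, eta_t, theta_t are data-dependent; fixed here."""
    k, v = x_t @ W_K, x_t @ W_V
    grad = 2.0 * np.outer(M @ k - v, k)  # grad of ||M k - v||^2 wrt M
    S = eta * S - theta * grad
    M = (1.0 - alpha) * M + S
    return M, S

for _ in range(50):
    M, S = titans_update(M, S, rng.standard_normal(d))
```

The $(1 - \alpha_t)$ decay bounds how much stale content survives, while the momentum in $S_t$ lets a surprising token keep influencing the memory for several subsequent steps.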
 
 

Memory as a Context

In Memory as a Context (MAC), the output retrieved from the long-term memory is prepended, together with the persistent-memory tokens, to the current segment, and attention runs over the combined sequence.

Memory as a Gate

In Memory as a Gate (MAG), the long-term memory branch and a sliding-window attention branch process the input in parallel, and their outputs are combined through a learned gate.

Memory as a Layer

In Memory as a Layer (MAL), the long-term memory module is stacked as a layer before attention, so the sequence is first compressed by the memory and then attended over.

LMM (Long term memory only model)

On needle-in-a-haystack (NIAH) retrieval tasks, the LMM variant, which uses the neural long-term memory on its own without attention, performs best, since it is dedicated entirely to the long-range recall that the task stresses.
 
 

Persistent Memory
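In Titans, persistent memory is a set of learnable but input-independent tokens prepended to every sequence; they store task-level knowledge and, unlike the long-term memory above, are fixed at test time. A minimal sketch, using random stand-ins for the trained tokens and assumed toy sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_persist, seq_len = 8, 4, 16  # toy sizes (assumed)

# Persistent memory: input-independent tokens (trained parameters in the
# model; random stand-ins here).
P = rng.standard_normal((n_persist, d))

def prepend_persistent(x):
    """Prepend persistent tokens so attention always sees them,
    regardless of the current input segment."""
    return np.concatenate([P, x], axis=0)

x = rng.standard_normal((seq_len, d))
y = prepend_persistent(x)
# y has shape (n_persist + seq_len, d); the first rows are always P.
```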

 
 
 
 
 
