Gated Retention

Creator: Alan Jo
Created: 2024 May 18 7:7
Editor: Alan Jo
Edited: 2024 May 18 7:8
Gated retention (gRet, also known as gRetNet or RetNet-3) augments retention with a data-dependent gating mechanism. For sequence modeling, it achieves training parallelism, good performance, and low inference cost simultaneously.
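
The note above does not give the formulas, so the following is only a minimal NumPy sketch under stated assumptions. It assumes the gate is a per-token scalar `gamma_n = sigmoid(x_n · w_gamma) ** (1 / tau)` and the state update is `S_n = gamma_n * S_{n-1} + k_n^T v_n` with output `o_n = q_n S_n`. The function names, `w_gamma`, and `tau` are hypothetical and chosen for illustration. The sketch shows why one mechanism can offer both cheap inference (a recurrent form with a fixed-size state) and training parallelism (an equivalent matrix form).

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def data_dependent_gate(x, w_gamma, tau=16.0):
    """Per-token decay in (0, 1), computed from the input itself.

    x: (T, d_model), w_gamma: (d_model,). Assumed form, not taken from the note.
    """
    return sigmoid(x @ w_gamma) ** (1.0 / tau)


def gated_retention_recurrent(q, k, v, gamma):
    """Inference-style form: one fixed-size state, O(1) memory per step.

    q, k: (T, d_k); v: (T, d_v); gamma: (T,)
    """
    T, d_k = q.shape
    d_v = v.shape[1]
    state = np.zeros((d_k, d_v))
    out = np.zeros((T, d_v))
    for n in range(T):
        # Decay the old state by the token's gate, then add the new key/value outer product.
        state = gamma[n] * state + np.outer(k[n], v[n])
        out[n] = q[n] @ state
    return out


def gated_retention_parallel(q, k, v, gamma):
    """Training-style form: all positions computed at once with a causal decay mask."""
    T = len(gamma)
    b = np.cumsum(np.log(gamma))            # b[i] = sum of log-gates up to position i
    diff = b[:, None] - b[None, :]          # diff[i, j] = log of prod_{m=j+1..i} gamma[m]
    causal = np.tril(np.ones((T, T), dtype=bool))
    # Zero the exponent outside the causal region before exp to avoid overflow.
    decay = np.where(causal, np.exp(np.where(causal, diff, 0.0)), 0.0)
    return ((q @ k.T) * decay) @ v
```

Both functions should return the same output up to floating-point error. If every gate is replaced by one fixed constant, the sketch reduces to plain retention with a data-independent decay, which is the sense in which the gate "augments" retention.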
