Infini Transformer

Created: 2024 Apr 15 15:26
Creator: Seonglae Cho
Edited: 2024 Apr 17 2:53

A form of model-internal RAG: the model implements a working-memory-like compressive memory inside its own attention layers, rather than retrieving from an external store.

A 1M-token context window has been tested, and in principle there is no upper limit on context length.
  • Uses the standard local dot-product attention mechanism of transformers within each segment.
  • Adds a global attention path through a compressive memory that summarizes keys and values from past segments.
  • Merges the local and global outputs with a learned gate to handle extended contexts efficiently (see the sketch below).
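
The combination of a local attention pass and a compressive-memory read can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch version; the `elu + 1` feature map, the `M + σ(K)ᵀV` memory update, and a single scalar gate are assumptions for illustration, not the exact published implementation.

```python
# Minimal sketch of Infini-attention-style gating between local attention and a
# compressive memory read. Shapes and the linear-attention memory update are
# simplified assumptions.
import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map commonly used for linear attention (assumption).
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, memory, z, gate):
    """Process one segment.

    q, k, v: (seq, d) projections for the current segment.
    memory:  (d, d) compressive memory accumulated over past segments.
    z:       (d,) normalization term for the memory.
    gate:    learned scalar mixing memory output vs. local attention output.
    """
    d = q.shape[-1]

    # 1) Standard local (causal) dot-product attention within the segment.
    scores = (q @ k.T) / d**0.5
    causal_mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal_mask, float("-inf"))
    local_out = scores.softmax(dim=-1) @ v

    # 2) Global read: retrieve from the compressive memory via linear attention.
    sigma_q = elu_plus_one(q)
    mem_out = (sigma_q @ memory) / (sigma_q @ z).clamp(min=1e-6).unsqueeze(-1)

    # 3) Merge the two paths with a learned sigmoid gate.
    beta = torch.sigmoid(gate)
    out = beta * mem_out + (1.0 - beta) * local_out

    # 4) Update the memory with this segment's keys/values for later segments.
    sigma_k = elu_plus_one(k)
    memory = memory + sigma_k.T @ v
    z = z + sigma_k.sum(dim=0)
    return out, memory, z

# Usage: iterate over segments of a long sequence, carrying the memory across.
d, seg = 64, 128
memory, z = torch.zeros(d, d), torch.zeros(d)
gate = torch.zeros(())  # in practice one learnable scalar per head
for segment in torch.randn(8, seg, d).unbind(0):  # 8 segments = 1024 tokens
    q = k = v = segment                            # real models use learned projections
    out, memory, z = infini_attention_segment(q, k, v, memory, z, gate)
```

Because the memory is a fixed-size d×d matrix updated segment by segment, per-segment compute and memory stay constant no matter how long the overall sequence grows, which is why there is no hard upper limit on context length.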
