FlexAttention

Creator: Seonglae Cho
Created: 2024 Aug 13 13:45
Edited: 2024 Aug 13 13:51

Score modification


ALiBi bias

Similar to
Relative Positional Encoding
, but it adds a per-head slope that is typically precomputed, and the resulting bias has beneficial properties for length extrapolation at inference time.

Soft-capping

FlexAttention is currently available in PyTorch nightly releases; the PyTorch team plans to ship it as a prototype feature in 2.5.0.

Recommendations