Score modification
# score_mod takes the scalar attention score together with its batch, head, query,
# and key/value indices, and returns a modified score.
def score_mod(score: f32[], b: i32[], h: i32[], q_idx: i32[], kv_idx: i32[]):
    return score  # identity score_mod shown here; user-defined logic replaces this

# Semantically, score_mod is applied to every entry of the score matrix:
for b in range(batch_size):
    for h in range(num_heads):
        for q_idx in range(sequence_length):
            for kv_idx in range(sequence_length):
                modified_scores[b, h, q_idx, kv_idx] = score_mod(
                    scores[b, h, q_idx, kv_idx], b, h, q_idx, kv_idx
                )
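For context, a minimal usage sketch assuming the torch.nn.attention.flex_attention API; the tensor shapes and the relative_positional score_mod below are illustrative, not from the original text:

import torch
from torch.nn.attention.flex_attention import flex_attention

batch_size, num_heads, sequence_length, head_dim = 2, 8, 128, 64
query = torch.randn(batch_size, num_heads, sequence_length, head_dim)
key = torch.randn(batch_size, num_heads, sequence_length, head_dim)
value = torch.randn(batch_size, num_heads, sequence_length, head_dim)

# An example score_mod: bias each score by the signed distance between query and key positions.
def relative_positional(score, b, h, q_idx, kv_idx):
    return score + (q_idx - kv_idx)

# In practice flex_attention is typically wrapped in torch.compile to generate a fused kernel.
out = flex_attention(query, key, value, score_mod=relative_positional)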
ALiBi bias
ALiBi is similar to Relative Positional Encoding, but it uses a per-head factor that is typically precomputed, and it has beneficial properties for length extrapolation at inference time.
alibi_bias = generate_alibi_bias()  # [num_heads]

def alibi(score, b, h, q_idx, kv_idx):
    bias = alibi_bias[h] * (q_idx - kv_idx)
    return score + bias
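generate_alibi_bias is referenced above but not defined; a minimal sketch, assuming the ALiBi paper's geometric slopes 2^(-8i/num_heads), negated so that alibi_bias[h] * (q_idx - kv_idx) penalizes distant key positions under a causal mask:

import torch

# Hypothetical helper (not part of the original snippet).
def generate_alibi_bias(num_heads: int = 8) -> torch.Tensor:
    # Slope for head i is -2^(-8 * i / num_heads), i = 1..num_heads, so later
    # heads apply a weaker distance penalty.
    return -torch.exp2(-torch.arange(1, num_heads + 1) * 8.0 / num_heads)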
Soft-capping
softcap = 20

def soft_cap(score, b, h, q_idx, kv_idx):
    score = score / softcap
    score = torch.tanh(score)
    score = score * softcap
    return score
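To see the effect, a small standalone check (not from the original post) that applies the same tanh transform element-wise:

import torch

softcap = 20.0
# tanh soft-capping squashes arbitrarily large logits into the open interval
# (-softcap, softcap) while leaving small scores nearly unchanged.
raw_scores = torch.tensor([-100.0, -1.0, 0.0, 1.0, 100.0])
capped = softcap * torch.tanh(raw_scores / softcap)
print(capped)  # roughly [-20.0, -1.0, 0.0, 1.0, 20.0]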
FlexAttention is currently available in PyTorch nightly releases; we plan to release it as a prototype feature in PyTorch 2.5.0.