Scaled Dot-Product Attention
FlashAttention and memory-efficient attention
pytorch.org
https://pytorch.org/docs/master/backends.html#torch.backends.cuda.sdp_kernel
PyTorch
https://pytorch.org/blog/out-of-the-box-acceleration/
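For reference, the operation these kernels compute is softmax(QK^T / sqrt(d)) V. PyTorch exposes it as `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to FlashAttention or memory-efficient implementations. The block below is only a minimal pure-Python sketch of the naive math, not of the fused kernels; the function names `softmax` and `naive_sdpa` are made up for illustration, and masking, dropout, and batching are left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def naive_sdpa(q, k, v):
    """Naive scaled dot-product attention (hypothetical helper).

    q: list of n_q query vectors, each of length d
    k: list of n_k key vectors, each of length d
    v: list of n_k value vectors, each of length d_v
    Returns a list of n_q output vectors, each of length d_v.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q_row in q:
        # Scaled similarity of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [1.0, 0.0]]
    v = [[0.0, 2.0], [4.0, 6.0]]
    # Identical keys give uniform weights, so the output is the mean of the values.
    print(naive_sdpa(q, k, v))
```

This version materializes the full row of attention scores for each query, which is the memory cost that FlashAttention and memory-efficient attention are designed to avoid.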


Seonglae Cho