Flash Decoding

Creator

Seonglae Cho

Created

2023 Oct 27 4:18

Editor

Seonglae Cho

Edited

2023 Nov 4 8:30

Refs

FlashDecoding++

Paper page - FlashDecoding++: Faster Large Language Model Inference on GPUs

https://huggingface.co/papers/2311.01282

Flash-Decoding for long-context inference (PyTorch blog)

https://pytorch.org/blog/flash-decoding/
