

FlashDecoding++
FlashDecoding++: Faster Large Language Model Inference on GPUs
https://huggingface.co/papers/2311.01282
PyTorch
https://pytorch.org/blog/flash-decoding/


Seonglae Cho