WGMMA (warp group matrix multiply accumulate)

PTX ISA 8.4
The programming guide to using PTX (Parallel Thread Execution) and ISA (Instruction Set Architecture).
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html

GPUs Go Brrr
how make gpu fast?
https://hazyresearch.stanford.edu/blog/2024-05-12-tk

Nvidia H100 GPUs: Supply and Demand
This post is an exploration of the supply and demand of GPUs, particularly Nvidia H100s.
https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/