LUT-GEMM: Quantized Matrix Multiplication based on LUTs for...The recent advancements in self-supervised learning, combined with the Transformer architecture, have enabled natural language processing (NLP) to achieve remarkably low perplexity. However,...https://arxiv.org/abs/2206.09557