LUT-GEMM: Quantized Matrix Multiplication based on LUTs for...
The recent advancements in self-supervised learning, combined with the Transformer architecture, have enabled natural language processing (NLP) to achieve remarkably low perplexity. However,...
https://arxiv.org/abs/2206.09557