AWQ

Creator: Seonglae Cho
Created: 2023 Nov 2 15:37
Edited: 2023 Dec 9 4:3
Refs: GPTQ

Activation-aware Weight Quantization

AWQ doesn’t quantize all the weights in a model; instead, it preserves a small percentage of weights that are important for LLM performance. This significantly reduces quantization loss, so models can run in 4-bit precision without noticeable performance degradation.
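
A minimal sketch of the idea described above, assuming PyTorch; this is not the AWQ reference implementation (which ultimately uses per-channel scaling rather than mixed precision), and all names are illustrative. Input channels are ranked by the average magnitude of calibration activations, the most salient ones are kept in full precision, and the rest are rounded to 4-bit.

```python
import torch

def fake_quantize_4bit(w: torch.Tensor) -> torch.Tensor:
    """Symmetric round-to-nearest 4-bit quantization of a weight tensor."""
    scale = w.abs().max() / 7  # signed 4-bit range is [-8, 7]
    return torch.clamp((w / scale).round(), -8, 7) * scale

def awq_style_quantize(weight: torch.Tensor, calib_acts: torch.Tensor,
                       keep_ratio: float = 0.01) -> torch.Tensor:
    """weight: [out_features, in_features]; calib_acts: [n_tokens, in_features]."""
    # Salience of each input channel = mean |activation| over calibration data
    salience = calib_acts.abs().mean(dim=0)
    n_keep = max(1, int(keep_ratio * weight.shape[1]))
    keep_idx = salience.topk(n_keep).indices

    quantized = fake_quantize_4bit(weight)
    # Preserve the most activation-salient input channels in full precision
    quantized[:, keep_idx] = weight[:, keep_idx]
    return quantized
```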
AWQ Usages
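
A hedged usage sketch: loading a pre-quantized AWQ checkpoint through Hugging Face Transformers (the autoawq package must be installed; the model id below is only an illustrative community checkpoint, not an endorsement).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mistral-7B-Instruct-v0.1-AWQ"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("AWQ keeps salient weights accurate:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```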
Recommendations