AutoGPTQ Quantization

CUDA inference has a known issue when group_size = 1024 is combined with desc_act = False. Triton inference is unaffected.
Updated 2023 Jul 9 18:41
An example of calling quantize(traindataset) is available there.
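
For orientation, a quantize(traindataset) call might look roughly like the sketch below. It is a minimal illustration, not the example the note refers to. It assumes the auto_gptq and transformers packages are installed. The names make_quantize_kwargs, quantize_model, model_dir, out_dir, texts, and backend are hypothetical and introduced here only for illustration. The warning simply restates the group_size = 1024 + desc_act = False observation from this note and is not a check performed by AutoGPTQ itself.

```python
import warnings


def make_quantize_kwargs(bits=4, group_size=128, desc_act=False, backend="cuda"):
    """Hypothetical helper: collect BaseQuantizeConfig arguments and warn on
    the combination this note flags for CUDA inference."""
    if backend == "cuda" and group_size == 1024 and not desc_act:
        warnings.warn(
            "group_size=1024 with desc_act=False is reported to have an issue "
            "with CUDA inference; Triton is reported as unaffected."
        )
    # 'backend' is only used for the warning; it is not a config field.
    return {"bits": bits, "group_size": group_size, "desc_act": desc_act}


def quantize_model(model_dir, out_dir, texts, **config_kwargs):
    """Sketch of a quantization run; model_dir, out_dir and texts are
    placeholders supplied by the caller."""
    # Imported lazily so the helper above works without the libraries installed.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # Each calibration example is a tokenizer output (input_ids, attention_mask).
    traindataset = [tokenizer(text) for text in texts]

    quantize_config = BaseQuantizeConfig(**make_quantize_kwargs(**config_kwargs))
    model = AutoGPTQForCausalLM.from_pretrained(model_dir, quantize_config)
    model.quantize(traindataset)
    model.save_quantized(out_dir)
    return model
```

Calling quantize_model("path/to/model", "path/to/output", ["some calibration text"], group_size=128) would run the full pipeline. The paths and the calibration text here are placeholders, and a real run needs a representative calibration set.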

Copyright Seonglae Cho