AutoGPTQ Triton

CUDA inference has an issue when group_size = 1024 is combined with desc_act = False. Triton inference is unaffected.
Updated 2023 Jul 9 18:41
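
The note above can be turned into a small guard that picks the backend from the quantization settings. This is a minimal sketch, not AutoGPTQ's own logic: `QuantSettings` and `prefer_triton` are hypothetical names introduced here for illustration, and the check encodes only the single combination the note mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantSettings:
    """Hypothetical container for the two settings named in the note."""
    group_size: int
    desc_act: bool


def prefer_triton(settings: QuantSettings) -> bool:
    """Return True for the combination the note reports as problematic on CUDA.

    Assumption: only group_size == 1024 together with desc_act == False is
    affected; every other combination is left on the CUDA path.
    """
    return settings.group_size == 1024 and not settings.desc_act


if __name__ == "__main__":
    for s in (QuantSettings(1024, False), QuantSettings(128, False)):
        backend = "triton" if prefer_triton(s) else "cuda"
        print(f"group_size={s.group_size}, desc_act={s.desc_act} -> {backend}")
```

The resulting boolean could, for instance, be passed as the `use_triton` argument when loading a quantized model with AutoGPTQ, assuming the installed version exposes that argument and Triton is available on the machine.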

Copyright Seonglae Cho