
TGI Quantization

Creator: Seonglae Cho
Created: 2023 Nov 15 15:17
Editor: Seonglae Cho
Edited: 2023 Nov 17 9:17
Refs
Model Quantization
  • quant_linear
  • ExLLaMa
  • Quantizer

simple

 
 

Main document and quantization list

https://github.com/huggingface/text-generation-inference/blob/main/docs/source/conceptual/quantization.md

AWQ

Add AWQ quantization inference support
Updated 2023 Oct 10 7:31

GPTQ

https://github.com/huggingface/text-generation-inference/tree/main/server/text_generation_server/utils/gptq
 
 


Copyright Seonglae Cho