
ONNX Quantization

Creator: Seonglae Cho
Created: 2023 Jun 1 14:13
Editor: Seonglae Cho
Edited: 2023 Sep 13 8:16
Refs: Model Quantization

Linear quantization from 32-bit floating point to 8-bit integers
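
To make the 32-bit to 8-bit mapping concrete, here is a minimal sketch of linear quantization in plain Python. It assumes the usual affine form, where a real value is approximately scale * (quantized - zero_point), and an unsigned 8-bit target range of 0 to 255. The function names are illustrative and do not come from any library.

```python
def compute_qparams(rmin, rmax, qmin=0, qmax=255):
    """Derive scale and zero point for a real range [rmin, rmax]."""
    # Widen the range so that 0.0 is always representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range: any positive scale works
    zero_point = round(qmin - rmin / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers in [qmin, qmax], clamping out-of-range values."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Recover approximate floats: real = scale * (quantized - zero_point)."""
    return [scale * (q - zero_point) for q in qvalues]


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.5, 2.0]
    scale, zero_point = compute_qparams(min(data), max(data))
    q = quantize(data, scale, zero_point)
    print(q, dequantize(q, scale, zero_point))
```

For values inside the calibrated range, the round-trip error should stay within about half of one quantization step (scale / 2).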

ONNX Quantization Notions
ONNX quantization pre-processing
Asymmetric quantization
Symmetric quantization
Dynamic Quantization
Static Quantization
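
The notions above can be contrasted in a short sketch. It assumes that symmetric quantization fixes the zero point at 0 and scales by the largest absolute value, that asymmetric quantization uses the full min/max range, that static quantization fixes its parameters ahead of time from calibration data, and that dynamic quantization recomputes them for each input at inference time. All names and sample values are hypothetical.

```python
def symmetric_params(values, qmax=127):
    """Symmetric int8: range centered on zero, zero point fixed at 0."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return scale, 0


def asymmetric_params(values, qmin=0, qmax=255):
    """Asymmetric uint8: scale and zero point follow the observed min/max."""
    rmin = min(min(values), 0.0)
    rmax = max(max(values), 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0
    zero_point = max(qmin, min(qmax, round(qmin - rmin / scale)))
    return scale, zero_point


# Static: parameters are computed once from calibration samples and then
# reused for every inference call.
calibration_batches = [[0.1, 0.9, 2.0], [0.0, 1.5, 3.0]]
static_scale, static_zp = asymmetric_params(
    [v for batch in calibration_batches for v in batch]
)


def dynamic_params(activation):
    """Dynamic: parameters are recomputed from each input at inference time."""
    return asymmetric_params(activation)
```

The trade-off this illustrates: the static parameters cost nothing at inference time but depend on how representative the calibration data is, while the dynamic parameters adapt to each input at the price of extra work per call.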
 
 
ONNX Quantization Usages
QOperator
QDQ
Quantization on GPU
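
As a usage sketch, the ONNX Runtime Python package exposes `quantize_dynamic` and `quantize_static` in `onnxruntime.quantization`, and `quantize_static` takes a `quant_format` argument that selects between the QDQ and QOperator representations. The wrapper names and file paths below are hypothetical, and the imports are deferred into the functions only so that the file loads without `onnxruntime` installed.

```python
def quantize_dynamic_model(model_input: str, model_output: str) -> None:
    """Dynamic quantization: no calibration data is needed."""
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(model_input, model_output, weight_type=QuantType.QUInt8)


def quantize_static_model(model_input: str, model_output: str,
                          data_reader, use_qdq: bool = True) -> None:
    """Static quantization in either QDQ or QOperator format.

    data_reader is assumed to be a CalibrationDataReader instance whose
    get_next() yields input feeds until it returns None.
    """
    from onnxruntime.quantization import QuantFormat, quantize_static

    quant_format = QuantFormat.QDQ if use_qdq else QuantFormat.QOperator
    quantize_static(model_input, model_output, data_reader,
                    quant_format=quant_format)


# Hypothetical usage, assuming "model.onnx" exists on disk:
# quantize_dynamic_model("model.onnx", "model.quant.onnx")
```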
 
 
 
 
Quantize ONNX models (ONNX Runtime documentation)
https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html
 
 

Copyright Seonglae Cho