Dynamic Quantization

Creator
Seonglae Cho
Created
2024 Jan 15 16:17
Edited
2024 Jan 15 16:18
In dynamic quantization, only the model's weights are quantized ahead of time (statically). Activations are quantized on the fly during inference, with their quantization parameters, such as the scale, computed from the values actually observed at runtime.
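The split between static weights and dynamic activations can be illustrated with a minimal pure-Python sketch. It assumes symmetric 8-bit quantization with one scale per weight row; the names `quantize` and `DynamicQuantLinear` are hypothetical and do not correspond to any library's API.

```python
def quantize(values, num_bits=8):
    """Symmetric quantization: return (integer codes, scale)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


class DynamicQuantLinear:
    """Hypothetical linear layer y = W x using dynamic quantization."""

    def __init__(self, weight_rows):
        # Weights: quantized once, ahead of time (one scale per output row).
        self.q_rows = [quantize(row) for row in weight_rows]

    def __call__(self, x):
        # Activations: the scale is derived from the values seen in this call.
        q_x, x_scale = quantize(x)
        out = []
        for q_w, w_scale in self.q_rows:
            acc = sum(a * b for a, b in zip(q_w, q_x))  # integer accumulation
            out.append(acc * w_scale * x_scale)         # dequantize the result
        return out


layer = DynamicQuantLinear([[0.5, -1.0], [2.0, 0.25]])
small = layer([0.1, 0.2])    # activation scale fitted to a small input range
large = layer([10.0, 20.0])  # same weights, different activation scale
```

Because the activation scale is recomputed on every call, the same pre-quantized weights handle both the small and the large input without a calibration step. The cost is the extra work of scanning the activations at inference time.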
