Getting Started with Quantization
What is Quantization?
Quantization is the process of mapping higher-precision weights and activations to a lower-precision representation. In simple terms, it shrinks a model so that it can run on hardware with limited memory.
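To make the idea concrete, here is a minimal sketch in plain Python. It assumes one simple scheme (a single symmetric scale mapping float32 values to int8), which is an illustrative choice and not necessarily the method the full post goes on to describe. The function names `quantize_int8` and `dequantize` are hypothetical helpers written for this example.

```python
from array import array


def quantize_int8(weights):
    """Map floats to int8 using one symmetric scale (an assumed, simple scheme)."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude onto 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Toy "weights" stored as 32-bit floats (typecode "f").
weights = array("f", [0.12, -0.5, 0.33, 0.9, -0.07])
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("float32 bytes:", weights.itemsize * len(weights))
print("int8 bytes:   ", q.itemsize * len(q))
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The int8 copy takes a quarter of the storage, and the cost is a small rounding error of at most half the scale per value.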
Linear Quantizatio...
siddartha10.hashnode.dev · 4 min read
Subhasya Tippareddy
A very clear and concise explanation. Thanks!