quantizing to int values
Hello, I have used your QAT model to quantize to different bitwidths, but I noticed that the quantized weights were always floating-point values. For example, when I quantized to 4 bits, all my weights were quantized to 16 discrete values, but those values were floats rather than integers.
I was wondering whether there is a way to perform QAT with a quantization technique that quantizes to integers, so that it would be more hardware-efficient. Thank you.
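For context, here is a minimal sketch of the behavior I am describing, assuming the wrappers behave like TF's fake-quant op (that is an assumption on my part, not something I have verified in the library's source):

```python
import tensorflow as tf

# Fake-quantize a float tensor to 4 bits: values snap to one of 2^4 = 16
# levels on [min, max], but the output dtype stays float32.
w = tf.random.uniform([8], minval=-1.0, maxval=1.0)
w_q = tf.quantization.fake_quant_with_min_max_args(w, min=-1.0, max=1.0, num_bits=4)

print(w_q)                           # discrete levels, but still float values
print(len(set(w_q.numpy())) <= 16)   # at most 16 distinct levels
```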
Currently, TF does not have the kernels required to run all quantized operations, so we emulate quantization with float. To run a truly quantized model, our users generally convert to TFLite.
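As a rough sketch of that flow (assuming a Keras model wrapped with `tfmot.quantization.keras.quantize_model`; the model and file names here are placeholders), converting with the TFLite converter produces a model whose weights are stored as integers. Note that TFLite's built-in integer kernels are primarily 8-bit, so a 4-bit scheme would still need custom kernels:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# A toy quantization-aware model; in practice this would be your trained QAT model.
model = tfmot.quantization.keras.quantize_model(
    tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(4,))]))

# Convert to TFLite: the quantization parameters learned during QAT are used
# to emit a model with integer weights that runs on integer kernels.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('quantized_model.tflite', 'wb') as f:
    f.write(tflite_model)
```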
However, if you have custom TF kernels you would like to use on your hardware, you are welcome to create the required scheme and registry. We are working on opening up the API to allow a custom quantization registry.
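In the meantime, the public API does let you control how individual layers are quantized through a custom `QuantizeConfig`, following the pattern in the tfmot quantization-aware training comprehensive guide. This is not the full scheme/registry mechanism, but it shows the shape of the extension point. A minimal sketch, assuming a Dense layer and a hypothetical 4-bit config (the class name `Dense4BitQuantizeConfig` is mine):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model
quantize_apply = tfmot.quantization.keras.quantize_apply
quantize_scope = tfmot.quantization.keras.quantize_scope
LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class Dense4BitQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Hypothetical 4-bit config for Dense layers."""

    def get_weights_and_quantizers(self, layer):
        # Quantize the kernel to 4 bits, tracking min/max from the last batch.
        return [(layer.kernel,
                 LastValueQuantizer(num_bits=4, symmetric=True,
                                    narrow_range=False, per_axis=False))]

    def get_activations_and_quantizers(self, layer):
        # Quantize the activation output using a moving-average range.
        return [(layer.activation,
                 MovingAverageQuantizer(num_bits=4, symmetric=False,
                                        narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []  # outputs are already covered by the activation quantizer

    def get_config(self):
        return {}


# Annotate the layers that should use the custom config, then apply QAT.
model = quantize_annotate_model(tf.keras.Sequential([
    quantize_annotate_layer(tf.keras.layers.Dense(10, input_shape=(4,)),
                            Dense4BitQuantizeConfig()),
]))

with quantize_scope({'Dense4BitQuantizeConfig': Dense4BitQuantizeConfig}):
    quant_aware_model = quantize_apply(model)
```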
Can anyone help me with how to create that registry?