GPU Inferencing in QKeras

Open yogasrivarshan opened this issue 3 years ago • 0 comments

Hi there! I am interested in running the QKeras MNIST CNN example from the examples section (Link). This example quantizes the weights and activations to 4 bits (INT4) using the quantized_bits(4,0,1) quantizer for the Conv kernels and activations. Is there any way to perform GPU inference with such a model, for example by converting it into a TensorRT (TRT) engine? This approach is widely used with packages like NVIDIA-QAT, so I suppose there should be a way to do it for QKeras as well.
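For reference, here is a minimal pure-Python sketch of what I understand quantized_bits(4,0,1) to compute for a single value: 4 bits including the sign, 0 integer bits, symmetric range, and a unit scale (alpha left unset). The function name `fake_quantize` is made up for illustration and is not part of QKeras; the real quantizer works on tensors and has more options.

```python
def fake_quantize(x, bits=4, integer=0, symmetric=True):
    """Illustrative stand-in (assumption), not the QKeras implementation.

    Maps x onto a signed fixed-point grid with `bits` total bits
    (one of them the sign bit) and `integer` integer bits.
    """
    steps_per_unit = 2 ** (bits - 1 - integer)      # 8 for (4, 0): step of 1/8
    q_max = 2 ** (bits - 1) - 1                     # largest integer code: 7
    q_min = -q_max if symmetric else -(q_max + 1)   # -7 if symmetric, else -8
    q = round(x * steps_per_unit)                   # nearest grid point
    q = max(q_min, min(q_max, q))                   # saturate to the code range
    return q / steps_per_unit


# Values are snapped to multiples of 1/8 and clipped to [-0.875, 0.875].
samples = [0.3, 1.5, -2.0, 0.06]
quantized = [fake_quantize(v) for v in samples]
```

If this reading is right, the trained weights only take values on this 4-bit grid, which is why I would expect them to map onto an integer inference engine.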

Thanks, Yoga

yogasrivarshan, Aug 11 '22 07:08