qkeras
GPU Inference in QKeras
Hi there! I am interested in running the QKeras MNIST CNN example from the examples section (Link). This example quantizes the Conv kernels and activations to 4 bits (INT4) using quantized_bits(4,0,1). Is there any way to run GPU inference on such a model, for example by converting it into a TensorRT (TRT) engine? This approach is widely used with packages like NVIDIA-QAT, so I suppose there should be a way to do it for QKeras as well.
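For reference, here is a minimal pure-Python sketch of what I understand quantized_bits(4,0,1) to compute on a single value. It assumes the three arguments mean bits=4, integer=0, symmetric=1, and it ignores details of the real quantizer such as alpha scaling and the straight-through gradient used in training. The function name `fake_quantize_4bit` is hypothetical and is not part of QKeras.

```python
def fake_quantize_4bit(x, bits=4, integer=0, symmetric=True):
    """Round x to a signed fixed-point grid and return the de-quantized float.

    Illustrative only: an assumed reading of quantized_bits(4, 0, 1),
    not the QKeras implementation.
    """
    # One bit is the sign; the remaining bits split into integer and fraction.
    steps_per_unit = 2 ** (bits - 1 - integer)   # 8 for (bits=4, integer=0)
    q_max = 2 ** (bits - 1) - 1                  # largest integer code: 7
    # Symmetric mode drops the most negative code so the range is [-7, 7].
    q_min = -q_max if symmetric else -q_max - 1

    q = round(x * steps_per_unit)                # nearest integer code (ties to even)
    q = max(q_min, min(q_max, q))                # saturate to the 4-bit range
    return q / steps_per_unit                    # back to a float on the grid


if __name__ == "__main__":
    for value in (0.3, 2.0, -2.0):
        print(value, "->", fake_quantize_4bit(value))
```

Under these assumptions each weight or activation takes one of 15 values between -0.875 and 0.875 in steps of 0.125, so a GPU backend would need to reproduce the same scale (1/8) and clipping range to match the trained model.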
Thanks, Yoga