Q-ASR Reproducing results from paper

Open auphelia opened this issue 2 years ago • 0 comments

Hi,

I would like to reproduce the results from your paper "Integer-only Zero-shot Quantization for Efficient Speech Recognition" for int8 (or even int4, if possible) QuartzNet 15x5 on NVIDIA A10 and A100 GPUs, and additionally measure throughput.

I was trying to use the Q-ASR repo for that, but I cannot find the TensorRT export. Is it published somewhere else? If I understand the code in the repo correctly, the execution in inference.py does not make use of the GPU's Tensor Cores. Am I overlooking something here?

Kind regards

May 04 '22 12:05 auphelia
