Empty transcription with "parakeet-tdt-0.6b-v2" on some files

Open ahazned opened this issue 6 months ago • 6 comments

Hi,

First of all, thank you very much for adding all the latest models to sherpa-onnx. However, there seems to be an issue with the implementation of the recently added parakeet-tdt-0.6b-v2 model: sherpa-onnx produces empty output for some files. I'm attaching an example from the GigaSpeech test set. Both the official implementation on Hugging Face and onnx-asr produce results for this file, but sherpa-onnx does not.

Do you have any idea what might be causing this?

Example file download link: https://drive.google.com/file/d/1Ltd3xlS0-FEabO2dAlbOnpKR3XiItahm/view?usp=sharing

  • sherpa-onnx result: {"lang": "", "emotion": "", "event": "", "text": "", "timestamps": [], "tokens":[], "words": []} (using the model folder "sherpa-onnx-nemo-parakeet-tdt-0.6b-v2")

  • huggingface implementation result: Just before he passed away, when I visited him, he said, Howard, I'm working on becoming more of an optimal person. I'm just doing it now, and he was like so energetic. (https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2)

  • onnx-asr implementation result: Just before he passed away, when I visited him, he said, Howard, I'm working on becoming more of an optimal person. I'm just doing it now, and he was like so energetic. (https://github.com/istupakov/onnx-asr)

Thanks.

ahazned avatar May 28 '25 15:05 ahazned

I am facing the same issue. I tried to investigate this a little bit and found a workaround by adding dithering.

Unfortunately, the internal dithering parameter is hardcoded to 0. (@csukuangfj, can we change this?)

So, you will have to add dithering to your input files directly.

It should have worked without dithering, though, so there is some issue in either the sherpa-onnx implementation or the model.
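For anyone who wants to try the workaround of dithering the input files directly, here is a minimal sketch in Python using only the standard library. It is not taken from sherpa-onnx: the function names `add_dither` and `dither_wav` are hypothetical, the noise is plain Gaussian noise added to samples scaled to [-1, 1], and the default amplitude of 0.001 is simply borrowed from the `--dither=0.001` value mentioned in this thread. Whether this matches what the internal dither option does is an assumption.

```python
import array
import random
import sys
import wave


def add_dither(samples, dither=0.001, seed=None):
    """Return a copy of float samples (range [-1, 1]) with Gaussian noise added.

    `dither` is the standard deviation of the noise (an assumed scale, see above).
    """
    rng = random.Random(seed)
    # Clamp so the result stays inside the valid float sample range.
    return [min(1.0, max(-1.0, s + rng.gauss(0.0, dither))) for s in samples]


def dither_wav(src_path, dst_path, dither=0.001, seed=None):
    """Read a 16-bit PCM WAV file, add dither, and write the result to dst_path."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        pcm = array.array("h")
        pcm.frombytes(src.readframes(params.nframes))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    # Convert int16 -> float, add noise, convert back to int16 with clipping.
    floats = [s / 32768.0 for s in pcm]
    noisy = add_dither(floats, dither, seed)
    out = array.array(
        "h", (max(-32768, min(32767, round(x * 32768.0))) for x in noisy)
    )
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(out.tobytes())
```

Calling `dither_wav("input.wav", "input-dithered.wav")` (file names are placeholders) writes a dithered copy that can then be passed to the recognizer in place of the original file.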

vsd-vector avatar Sep 29 '25 12:09 vsd-vector

can we change this?

Yes, can you make a pull request to delete that line?

The default value of dither is still 0, so it won't change the current behavior.

You can change it from the command line by passing

--dither=0.001

to use 0.001 for dither.

csukuangfj avatar Sep 29 '25 12:09 csukuangfj

@vsd-vector

csukuangfj avatar Sep 29 '25 12:09 csukuangfj

Image

csukuangfj avatar Sep 29 '25 12:09 csukuangfj

Sorry, I don't understand. Did it work for you without removing the hardcoded value?

vsd-vector avatar Sep 29 '25 13:09 vsd-vector

Sorry, I don't understand. Did it work for you without removing the hardcoded value?

No. It won't work.

I have to remove the hardcoded dither to make it work.

csukuangfj avatar Sep 30 '25 03:09 csukuangfj