espnet_onnx wav quality drop

Hi, I initialized Text2Speech with my own acoustic model (am_model) and vocoder and exported them to ONNX, but the sound quality drops significantly. The only change I made was to the HiFi-GAN inference code in https://github.com/Masao-Someki/espnet_onnx/blob/feature/add_PWGVocoder/espnet_onnx/export/tts/models/vocoders/parallel_wavegan.py, because the HiFi-GAN code in the ParallelWaveGAN repo does not support the parameter x. I compared the ESPnet acoustic model and vocoder with their ONNX counterparts, and they look the same. Could you please offer some advice?

Nov 28 '22 08:11 1nlplearner

When I delete the postprocess code in https://github.com/Masao-Someki/espnet_onnx/blob/master/espnet_onnx/tts/tts_model.py, the model synthesizes speech that matches PyTorch inference.

Nov 29 '22 11:11 1nlplearner

@1nlplearner Thank you for reporting this issue.

> When I delete the postprocess code in https://github.com/Masao-Someki/espnet_onnx/blob/master/espnet_onnx/tts/tts_model.py, the model synthesizes speech that matches PyTorch inference.

It seems that the normalization step is causing this issue. Could you open your config file at ~/.cache/espnet_onnx/<tag_name>/config.yml and check whether use_normalize is set to false? I think setting use_normalize: false will fix this problem.
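For anyone who wants to script that check, here is a minimal sketch using only the Python standard library. It scans the config text for `use_normalize` lines rather than parsing the YAML, because I am not assuming where in config.yml the key sits. The helper names `find_use_normalize` and `disable_normalize` are hypothetical and not part of espnet_onnx, and `<tag_name>` is a placeholder you need to replace with your own tag.

```python
import re
from pathlib import Path

# Matches a YAML line such as "  use_normalize: true  # comment".
# Group 1: indentation, key and colon. Group 2: the value. Group 3: the rest of the line.
_USE_NORMALIZE = re.compile(
    r"^([ \t]*use_normalize[ \t]*:[ \t]*)(\S+)(.*)$", re.MULTILINE
)


def find_use_normalize(config_text):
    """Return every value assigned to use_normalize in the YAML text."""
    return [m.group(2) for m in _USE_NORMALIZE.finditer(config_text)]


def disable_normalize(config_text):
    """Return the YAML text with every use_normalize value set to false."""
    return _USE_NORMALIZE.sub(r"\g<1>false\g<3>", config_text)


if __name__ == "__main__":
    tag_name = "<tag_name>"  # placeholder: replace with the tag you exported under
    config_path = Path.home() / ".cache" / "espnet_onnx" / tag_name / "config.yml"
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        print("current use_normalize values:", find_use_normalize(text))
        # Uncomment to write the change back (keep a backup of the file first):
        # config_path.write_text(disable_normalize(text), encoding="utf-8")
    else:
        print("config not found:", config_path)
```

If the key appears more than once in the file, this reports and rewrites every occurrence, so review the printed values before writing the change back.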

Dec 01 '22 11:12 Masao-Someki
