
Generated voice is unclear and noisy

Open mirfan899 opened this issue 5 years ago • 7 comments

I've trained the model for Cantonese, using the MTTS frontend (https://github.com/Jackiexiao/MTTS) with modifications for Cantonese (https://github.com/mirfan899/MTTS). The model trains and wav files are generated, but the audio is noisy and unclear. I've attached the log (output.log) for reference, along with a generated audio sample (ASR1.wav.zip).

mirfan899 avatar Jul 26 '19 11:07 mirfan899

Maybe you can try increasing the number of acoustic model training epochs. What vocoder are you using? You can extract the audio features from one of your recordings and resynthesize it with that vocoder to check the vocoder's quality on your audio files.

HiiamCong avatar Aug 02 '19 03:08 HiiamCong

It seems my question file is not good enough. I looked for information about question files but could not find any on the internet. If you have a link describing the question file format and the individual questions, please share it.

mirfan899 avatar Aug 05 '19 09:08 mirfan899

Does the frontend you've used not provide a question file? If not, you should figure out which features it generates in the label files and write a matching question file. Feel free to share it back into this repository under https://github.com/CSTR-Edinburgh/merlin/tree/master/misc/questions. Perhaps the Mandarin question file can provide a starting point for the Cantonese one?

RasmusD avatar Aug 05 '19 09:08 RasmusD
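
One rough way to check whether a question file matches the label files is to list the questions that never fire on any label. The sketch below assumes HTS-style binary questions of the form `QS "name" {pattern1,pattern2}` with `*` and `?` wildcards, and assumes the labels are bare context strings with any timestamps already stripped. The function names are hypothetical and not part of Merlin, and `CQS` lines are simply skipped.

```python
import re

# Matches lines such as: QS "C-a" {*-a+*,*-aa+*}
QS_LINE = re.compile(r'^\s*(QS|CQS)\s+"([^"]+)"\s+\{(.*)\}\s*$')


def parse_questions(text):
    """Return (name, [patterns]) for each binary QS line; CQS lines are skipped."""
    questions = []
    for line in text.splitlines():
        match = QS_LINE.match(line)
        if match and match.group(1) == "QS":
            patterns = [p.strip() for p in match.group(3).split(",") if p.strip()]
            questions.append((match.group(2), patterns))
    return questions


def wildcard_to_regex(pattern):
    """Translate an HTS-style wildcard pattern (* and ?) into an anchored regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")


def unused_questions(question_text, labels):
    """Names of QS questions whose patterns match none of the given labels."""
    unused = []
    for name, patterns in parse_questions(question_text):
        regexes = [wildcard_to_regex(p) for p in patterns]
        if not any(rx.match(label) for rx in regexes for label in labels):
            unused.append(name)
    return unused
```

A long list of unused questions, or phone names in the labels that no question mentions, would suggest the question file still reflects the Mandarin label set rather than the Cantonese one.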

I've used https://github.com/Jackiexiao/MTTS for this purpose. I've generated the question file for Cantonese using the Mandarin structure. After adding more data, some words in the generated voice are intelligible, but others are still unclear.

mirfan899 avatar Aug 05 '19 12:08 mirfan899

How much data are you using?

RasmusD avatar Aug 05 '19 13:08 RasmusD

Currently 220 audio files.

mirfan899 avatar Aug 05 '19 13:08 mirfan899

That's probably the issue then. I'd recommend at least 1000 sentences, and preferably a lot more for high-quality synthesis.

RasmusD avatar Aug 05 '19 15:08 RasmusD
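
To see how far a corpus is from that figure, a small script can count the recordings and total their duration. This is only a sketch: it assumes one utterance per uncompressed `.wav` file in a single directory, and `corpus_summary` and `MIN_UTTERANCES` are hypothetical names, not part of Merlin.

```python
import wave
from pathlib import Path

# Rough lower bound suggested in this thread; not a Merlin setting.
MIN_UTTERANCES = 1000


def corpus_summary(wav_dir):
    """Return (number of .wav files, total duration in seconds) for a directory."""
    count = 0
    seconds = 0.0
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            seconds += wav.getnframes() / wav.getframerate()
        count += 1
    return count, seconds


def has_enough_data(wav_dir, minimum=MIN_UTTERANCES):
    """True if the directory holds at least `minimum` utterances."""
    count, _ = corpus_summary(wav_dir)
    return count >= minimum
```

With 220 recordings, `has_enough_data` would return `False` under the default threshold.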