Edresson Casanova

Results: 41 comments of Edresson Casanova

@Youyoun Were you able to improve your WER?

Hi, you can find the labels file [here](http://immortal.multicomp.cs.cmu.edu/CMU-MOSEI/labels/CMU_MOSEI_Labels.csd) and use [this script](https://github.com/Strong-AI-Lab/emotion/blob/21c02f1e8ce96796cf3e9281e8aa0461fe3c7479/datasets/CMU-MOSEI/process.py#L41-L70) to preprocess the data.

Hello, the WER interpretation is a bit different; it uses the Levenshtein distance. The WER is defined as the edit (Levenshtein) distance at the word level divided by the number of...
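To make the definition above concrete, here is a minimal sketch of WER computed as the word-level Levenshtein distance divided by the number of words in the reference (the function name and the dynamic-programming layout are my own illustration, not from the thread):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein (edit) distance
    divided by the number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions turn ref[:i] into an empty hypothesis
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions build hyp[:j] from an empty reference
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the quick brown fox", "the quick brown dog"))  # 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why it should not be read as a simple percentage of wrong words.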

Voice conversion inference support was introduced in [TTS v0.6.2](https://github.com/coqui-ai/TTS/releases/tag/v0.6.2) (I recommend always using the latest release because of bug fixes). Between versions, we have changed config parameters, not the model...

Hi, yes, it is better to fine-tune the mentioned model. > How many hours of audio is needed to have appropriate quality? We didn't analyze the number of hours needed to...

> @Edresson Thank you very much for your help! > > As far as I understand, I also need to add the "characters" of my language to config.json, am I right?...
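For reference, a `characters` block in config.json might look like the sketch below. This is only an illustration: the exact keys depend on the TTS version, and the alphabet shown (a Latin alphabet with a few accented letters) is a made-up example, not taken from the thread.

```json
"characters": {
    "pad": "_",
    "eos": "&",
    "bos": "*",
    "characters": "abcdefghijklmnopqrstuvwxyzáéíóúñ ",
    "punctuations": "!'(),-.:;? ",
    "phonemes": ""
}
```

The important point is that every symbol appearing in your transcripts must be listed here, otherwise those characters are dropped or raise errors during tokenization.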

Hi, the article was made using my Coqui TTS fork, on the branch [multilingual-torchaudio-SE](https://github.com/Edresson/Coqui-TTS/tree/multilingual-torchaudio-SE/). To replicate the training, you can use this branch with the config.json available with each...

@annaklyueva You can work around it by setting "meta_file_val" to "datasets/first_voice/metadata.txt". This way you will have the same data in training and validation, which is not recommended in most...
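The workaround above amounts to pointing both metadata fields at the same file in the dataset entry of config.json. A hedged sketch (the formatter name and paths here are illustrative, not from the thread):

```json
"datasets": [
    {
        "name": "ljspeech",
        "path": "datasets/first_voice/",
        "meta_file_train": "datasets/first_voice/metadata.txt",
        "meta_file_val": "datasets/first_voice/metadata.txt"
    }
]
```

Since train and validation now share the same samples, the validation loss no longer measures generalization; this is only a stopgap when the dataset is too small to split.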

> Thanks, but the instruction on Coqui TTS is for TTS, not voice conversion, though, right? If so, is there any colab/documentation on fine-tuning voice conversion? Voice conversion and TTS...

Hi, it looks like use_d_vector is false, which means you are not using external speaker embeddings. The provided demo only works for training with external speaker embeddings. You need...
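To enable external speaker embeddings (d-vectors), the model section of config.json needs the d-vector options switched on. A hedged sketch, assuming the `use_d_vector_file` / `d_vector_file` / `d_vector_dim` parameter names used by Coqui TTS configs (the file path and dimension are illustrative):

```json
"use_d_vector_file": true,
"d_vector_file": "speakers.json",
"d_vector_dim": 512
```

Here `d_vector_file` points at a file of precomputed embeddings (one vector per utterance or speaker, e.g. from a speaker-encoder model), and `d_vector_dim` must match the dimensionality of those vectors.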