CaraDuf comments

Results: 43 comments of CaraDuf

How to read scorer output ?

That's really weird. I restarted the computer for another reason, and now it keeps the 87 / 87 datapoints (without my changing anything in the dataset). I may have screwed up...

Number of fine tuning steps recommended when fine tuning to avoid overfitting

OK, thank you. So should I change the learning rate in the fine-tuning script, or does it already take the 1/10 factor into account?
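To make the question concrete, here is a minimal sketch of the arithmetic behind a 1/10 factor. The pretraining learning rate, the function name, and the `already_scaled` flag are all hypothetical and not taken from the IMS-Toucan scripts; the only point is that the factor should be applied once, not twice.

```python
# Hypothetical values for illustration only; they are not read from IMS-Toucan.
PRETRAINING_LR = 1e-3   # assumed learning rate used for training from scratch
FINETUNE_FACTOR = 0.1   # the 1/10 factor discussed in the thread


def finetuning_lr(base_lr: float,
                  factor: float = FINETUNE_FACTOR,
                  already_scaled: bool = False) -> float:
    """Return the learning rate to use for fine-tuning.

    If the script's default already includes the factor, pass
    already_scaled=True so the value is returned unchanged.
    """
    if already_scaled:
        return base_lr
    return base_lr * factor


if __name__ == "__main__":
    # Case 1: the script still uses the pretraining rate, so scale it down.
    print(finetuning_lr(PRETRAINING_LR))
    # Case 2: the script default is already reduced, so leave it alone.
    print(finetuning_lr(1e-4, already_scaled=True))
```

Whether case 1 or case 2 applies depends on the default set in the fine-tuning script itself, which is exactly what the question asks.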

New ToucanTTS model gives far worse results after finetuning

I looked at the spectrograms: the tips are moving from left to right, but the horizontal stripes are there. I haven't looked at the losses yet; I'll let you know. I...

New ToucanTTS model gives far worse results after finetuning

To answer your previous questions, the L1 loss, Glow loss, and spectrogram (before or after, I don't remember for sure) look like the following: ![image](https://user-images.githubusercontent.com/91517923/231934312-060d653d-d49c-47f6-84c8-a4b0cfb0c63c.png) ![image](https://user-images.githubusercontent.com/91517923/231934387-27523b17-fc4d-4b2d-9eb4-6dd330230c81.png) ![image](https://user-images.githubusercontent.com/91517923/231935010-43177c7b-36db-446b-a025-c3a5aa644afb.png) L1 loss...

New ToucanTTS model gives far worse results after finetuning

@thoraxe the loss images are from wandb. You have to set up an account and then pass the parameter --wandb

New ToucanTTS model gives far worse results after finetuning

Quick feedback from my side: after pausing the cloning for one week and restarting the computer, it works great (v2.5). I will try to improve the dataset with the Adobe API...

Use Pytorch Lightning to speed up training ?

OK, thanks for your reply. Now I have to learn what "normalizing flows" means (especially what a flow is in TTS) 😉.

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Did you try https://github.com/DigitalPhonetics/IMS-Toucan/issues/88 ?

Nasal ɛ̃ ("in" in French) mispronounced only on female voices (not on male voices)

Actually, I do believe you already solved all that in v2.5 (so sorry to ask a "backward" question, but v2.4 works better for me as far as voice similarity is concerned). Would...

Nasal ɛ̃ ("in" in French) mispronounced only on female voices (not on male voices)

Fine-tuning (6k steps overall) Meta on the Siwis dataset and then fine-tuning (6k steps overall) the resulting model (Siwis) on my dataset gave better results, but not perfect ones. "pin" (pine...