Francesco
Good question, I run into the same problem. Haven't solved it yet, because we mostly use 1 GPU per process (democratically sharing them :D). This is probably better sought for...
I have not experimented with this yet, but in general it should be hard but doable. The results will probably vary. You can also experiment with adding some speaker embeddings (concatenating along...
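To make the concatenation idea concrete, here is a minimal sketch assuming a TF 2.x setup; the sizes and names (`n_speakers`, `spk_dim`, `encoder_out`) are invented for illustration and are not code from the repo:

```python
import tensorflow as tf

# Illustrative sizes, not taken from any config
n_speakers, spk_dim, d_model = 10, 64, 256
speaker_table = tf.keras.layers.Embedding(n_speakers, spk_dim)

encoder_out = tf.random.normal((2, 50, d_model))      # (batch, time, channels)
speaker_ids = tf.constant([3, 7])                     # one speaker id per sample

spk = speaker_table(speaker_ids)                      # (batch, spk_dim)
spk = tf.repeat(spk[:, tf.newaxis, :], tf.shape(encoder_out)[1], axis=1)
conditioned = tf.concat([encoder_out, spk], axis=-1)  # concatenate along the channel axis
print(conditioned.shape)                              # (2, 50, d_model + spk_dim)
```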
Sorry, I don't understand the difference from the previous question. You can do the following:
- train a model from scratch and see what the results look like (very likely to...
This is something I have always experienced when training forward models. Durations probably overfit easily. In practice it has not been an issue, although it makes it harder to gauge the status of training...
Hi, not currently. It is something I'm working on.
Hi,
1. you can find conv layers replacing dense layers after attention in FastSpeech, for example (see the sketch below);
2. we found that this helps with building attention, although with more recent improvements it might...
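For point 1, here is a minimal sketch of a FastSpeech-style block where the position-wise dense layers after attention are replaced by 1D convolutions. TF 2.x is assumed and the kernel size and dimensions are illustrative, not the repo's actual values:

```python
import tensorflow as tf

class ConvFFN(tf.keras.layers.Layer):
    """Position-wise feed-forward block using Conv1D instead of Dense layers."""

    def __init__(self, d_model=256, d_hidden=1024, kernel_size=9, dropout=0.1):
        super().__init__()
        self.conv1 = tf.keras.layers.Conv1D(d_hidden, kernel_size, padding="same", activation="relu")
        self.conv2 = tf.keras.layers.Conv1D(d_model, kernel_size, padding="same")
        self.dropout = tf.keras.layers.Dropout(dropout)
        self.norm = tf.keras.layers.LayerNormalization()

    def call(self, x, training=False):
        # residual connection around the conv stack, as in the Transformer FFN
        y = self.conv2(self.conv1(x))
        y = self.dropout(y, training=training)
        return self.norm(x + y)

# Example: a batch of 2 sequences, 100 timesteps, model dim 256.
out = ConvFFN()(tf.random.normal((2, 100, 256)))
print(out.shape)  # (2, 100, 256)
```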
Hi, I trained the autoregressive models for about 600K steps (some less) and around the same for the forward models. This should take, if I remember correctly, about 2-3 days...
Hi, batch sizes are dynamic. Samples are bucketed by duration, so the batch size depends on how many samples there are in each bin. Max sizes are specified in the...
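As a rough illustration of the bucketing idea (not the repo's actual pipeline), here is a `tf.data` sketch where samples are binned by mel length and each bin gets its own max batch size; the boundaries and batch sizes below are made up:

```python
import numpy as np
import tensorflow as tf

# Toy generator of (mel, text) pairs with variable-length mel spectrograms.
def gen():
    rng = np.random.default_rng(0)
    for _ in range(100):
        n_frames = int(rng.integers(50, 900))
        mel = rng.standard_normal((n_frames, 80)).astype(np.float32)
        text = rng.integers(1, 50, size=int(rng.integers(5, 60))).astype(np.int32)
        yield mel, text

dataset = tf.data.Dataset.from_generator(
    gen,
    output_signature=(
        tf.TensorSpec(shape=(None, 80), dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.int32),
    ),
)

# Buckets split by mel length; each bucket has its own max batch size so long
# samples form smaller batches and short samples form larger ones.
boundaries = [200, 400, 600, 800]
batch_sizes = [64, 42, 32, 25, 16]   # len(boundaries) + 1 entries

bucketed = dataset.apply(
    tf.data.experimental.bucket_by_sequence_length(
        element_length_func=lambda mel, text: tf.shape(mel)[0],
        bucket_boundaries=boundaries,
        bucket_batch_sizes=batch_sizes,
        padded_shapes=([None, 80], [None]),
    )
)

for mel_batch, text_batch in bucketed.take(3):
    print(mel_batch.shape, text_batch.shape)
```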
Maybe try 'espeak-ng' instead of 'espeakng'. Or visit http://espeak.sourceforge.net/
Hi, one quick thing you can try is switching from GPU to CPU by simply removing those lines. Unless you're predicting on a batch, batch size won't make any difference....
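If it helps, here are two generic ways to force TensorFlow onto the CPU (a sketch assuming TF 2.x; which lines to remove depends on your script):

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"   # hide all GPUs before TensorFlow initializes

import tensorflow as tf
tf.config.set_visible_devices([], "GPU")    # alternative: mask GPUs via tf.config
print(tf.config.get_visible_devices())      # should list only CPU devices
```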