
have you tried training with the full dataset?

Open · kyoguan opened this issue 7 years ago · 10 comments

[1 GPU training result]

This result is from training on a single GPU.

[8 GPU training result]

This result is from training on 8 GPUs. (BTW: you need to set BN = None, or you get a strange result, because of the batch normalization problem on multi-GPU.)

It seems to be an Adam optimizer problem.
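To be concrete about the BN setting, this is roughly what I mean (just a sketch; the exact hyperparameter name in hyperparams.py may differ in your copy):

```python
# hyperparams.py (sketch: the actual flag name in this repo may differ)
class Hyperparams:
    # With "bn", each of the 8 GPU towers keeps its own batch statistics over a
    # small per-tower batch, which is the multi-GPU batch-norm problem mentioned
    # above, so turn normalization off when training on multiple GPUs:
    norm_type = None   # e.g. "bn", "ln", "ins", or None
```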

kyoguan avatar Jul 05 '17 03:07 kyoguan

[mean_loss plot]

AzamRabiee avatar Jul 07 '17 06:07 AzamRabiee

My result on the full dataset. At epoch 40 (global step 40k), I get a better wav signal as well.

AzamRabiee avatar Jul 07 '17 06:07 AzamRabiee

@AzamRabiee What do your synthesized waves sound like at epoch 40? Could you share some of them?

candlewill avatar Jul 07 '17 06:07 candlewill

model_epoch_40_gs_40000_1.wav.zip The text is "abraham said i will swear". This is the best synthesized wav.

AzamRabiee avatar Jul 07 '17 07:07 AzamRabiee

I think you are still using sanity_check = True, which uses only a very small dataset; I got the same result. Can you try sanity_check = False?

kyoguan avatar Jul 09 '17 15:07 kyoguan

In the code, if sanity_check = True, the data is only a single mini-batch, and this batch is repeated 1000 times: texts, sound_files = texts[:hp.batch_size]*1000, sound_files[:hp.batch_size]*1000
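Roughly, the data loading looks like this (just a sketch with illustrative names, not the exact code from the repo):

```python
# Sketch of the sanity_check branch (names are illustrative).
from types import SimpleNamespace

hp = SimpleNamespace(sanity_check=True, batch_size=32)

# Pretend these lists came from the dataset's metadata file.
texts = ["transcript %d" % i for i in range(10000)]
sound_files = ["wavs/%05d.wav" % i for i in range(10000)]

if hp.sanity_check:
    # Keep only the first batch_size examples and repeat them 1000 times,
    # so training only ever sees that single mini-batch.
    texts = texts[:hp.batch_size] * 1000
    sound_files = sound_files[:hp.batch_size] * 1000
# With sanity_check = False, the full lists are used instead.
```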

zuoxiang95 avatar Jul 10 '17 04:07 zuoxiang95

> I think you are still using sanity_check = True, which uses only a very small dataset; I got the same result. Can you try sanity_check = False?

I don't think it works. I tried sanity_check=False and trained for 40 epochs as well, because using 10000 would cost me the whole summer :( The result is a total mess; it just keeps repeating some phoneme that I cannot understand.

sniperwrb avatar Jul 25 '17 18:07 sniperwrb

I did a non-sanity-check training run (asynchronously on 384 cores) with lr=0.00015, norm=ins, loss=l1, min_len=10, max_len=100, and r=5. Additionally, I used Luong attention, gradient clipping by norm 5, and binning. The results are bad, but I ran out of cloud credits before I thought training was done.
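For reference, the clipping part looks roughly like this in TensorFlow 1.x (a sketch rather than my exact training code; the toy variable and loss just stand in for the real Tacotron graph):

```python
import tensorflow as tf

# Sketch: Adam + gradient clipping by global norm (threshold 5), as in the run above.
w = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(tf.square(w))            # stands in for the real L1 loss
global_step = tf.Variable(0, trainable=False)

optimizer = tf.train.AdamOptimizer(learning_rate=0.00015)
grads_and_vars = optimizer.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
clipped_grads, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)
train_op = optimizer.apply_gradients(zip(clipped_grads, variables),
                                     global_step=global_step)
```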

[results image]

After reading some of the comments here, I regret using any sort of normalization. I will see if I can try a similar experiment on a single GPU later this week, but it is sure to take more time.

samples/model.ckpt-231691_25.wav.zip

tmulc18 avatar Jul 26 '17 08:07 tmulc18

@tmulc18 I think normalization is important somehow. When I run eval.py with norm=None, the result is a mess (even for the pretrained model from the tacotron author). When I run eval.py with some normalization but the training did not use any, it raises an error (presumably because the normalization layers introduce variables that are missing from the checkpoint)...

sniperwrb avatar Jul 26 '17 20:07 sniperwrb

Have you gotten good results? I trained the model on the whole dataset, but the loss is still above 1 and the results are not very good. Thanks a lot.

jpdz avatar Jul 31 '17 07:07 jpdz