
What batch size was used in the paper?

Open Oktai15 opened this issue 3 years ago • 7 comments

@jik876 I see batch_size=16 in the config, but I want to clarify: was that batch size 16 per GPU? And did you use 2 V100s for training with this batch size?

Oktai15 avatar May 30 '21 21:05 Oktai15

See https://github.com/jik876/hifi-gan/blob/master/train.py#L259: the total batch size is constant regardless of the number of GPUs.
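In other words, the configured batch size gets divided by the number of GPUs before each rank's DataLoader is built. A minimal, paraphrased sketch of that behavior (not a verbatim copy of train.py; config_batch_size and per_gpu_batch_size are placeholder names):

```python
import torch

# Paraphrased sketch of the batch-size handling around the linked line,
# not a verbatim copy of train.py. config_batch_size stands in for the
# batch_size=16 value from the config.
config_batch_size = 16

num_gpus = torch.cuda.device_count() if torch.cuda.is_available() else 1
per_gpu_batch_size = int(config_batch_size / max(num_gpus, 1))

# Each rank's DataLoader is built with the per-GPU value, so the effective
# global batch size stays at config_batch_size regardless of GPU count
# (e.g. 2 GPUs -> 8 samples per GPU, 16 in total).
print('Batch size per GPU:', per_gpu_batch_size)
```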

CookiePPP avatar May 31 '21 01:05 CookiePPP

@CookiePPP great, thanks! Then what I need is the total batch size that was used :)

Oktai15 avatar May 31 '21 02:05 Oktai15

I found that when there is only ONE speaker in the training dataset and I changed batch_size to 16 * num(GPU), the resulting waveform contained some noise, like reverberation, which did not happen when I used a dataset with TWO or MORE speakers.

JohnHerry avatar Sep 28 '21 01:09 JohnHerry

@JohnHerry I ran into the same problem. What do you think the reason is: the single-speaker dataset, or the batch_size?

yygg678 avatar Oct 20 '21 09:10 yygg678

@yyggithub I do not really know the reason. But I noticed that HiFi-GAN's MelDataset "shuffle" is turned off for multi-GPU training. With torch distributed multi-GPU training, the training process on each GPU is somewhat more "stand-alone". My guess is that if each GPU always sees the same fixed subset of the whole training dataset, then the bigger the batch_size, the larger the inconsistency between GPUs gets.
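For reference, the standard PyTorch DistributedSampler pattern looks roughly like the sketch below; whether missing reshuffling is really the cause here is only my guess, and the names trainset and per_gpu_batch_size are placeholders, not hifi-gan's variables:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Placeholder dataset standing in for hifi-gan's MelDataset.
trainset = TensorDataset(torch.randn(128, 80))
per_gpu_batch_size = 8

# Each rank gets a disjoint shard of the dataset. num_replicas/rank are
# hard-coded here only so the sketch runs without init_process_group;
# in real DDP training they come from the process group.
sampler = DistributedSampler(trainset, num_replicas=2, rank=0, shuffle=True)
loader = DataLoader(trainset, batch_size=per_gpu_batch_size,
                    shuffle=False, sampler=sampler)

for epoch in range(3):
    # Without set_epoch() every epoch replays exactly the same per-rank
    # order, so each GPU keeps seeing the same samples in the same order.
    sampler.set_epoch(epoch)
    for (batch,) in loader:
        pass  # training step would go here
```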

JohnHerry avatar Oct 22 '21 02:10 JohnHerry

Good. Have you tried single-speaker data on a single GPU? Does that have this problem?

yygg678 avatar Oct 22 '21 02:10 yygg678

Yes, training with a single-speaker dataset on a single GPU works fine.

JohnHerry avatar Oct 23 '21 07:10 JohnHerry