
Why do we need `batch_size=batch_size * group_size`? What is the point of the `group_size` variable?

chenming6615 opened this issue · 2 comments

How does setting group_size larger than 1 enable sorting in the Dataset?

And why do we need to enable sorting in the Dataset at all?

https://github.com/ming024/FastSpeech2/blob/d4e79eb52e8b01d24703b2dfc0385544092958f3/train.py#L31

```python
from torch.utils.data import DataLoader

batch_size = train_config["optimizer"]["batch_size"]
group_size = 4  # Set this larger than 1 to enable sorting in Dataset
assert batch_size * group_size < len(dataset)
loader = DataLoader(
    dataset,
    batch_size=batch_size * group_size,
    shuffle=True,
    collate_fn=dataset.collate_fn,
)
```
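For concreteness: if the config sets batch_size = 16 (an illustrative value) and group_size = 4 as above, the DataLoader hands dataset.collate_fn one shuffled "mega-batch" of 64 samples at a time, and the assert simply guarantees the dataset has enough samples to fill at least one such mega-batch.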

chenming6615 · Dec 13 '21

@chenming6615 Because in the collate_fn function we need to sort all the utterances in a batch, which costs time. If we set group_size > 1, we can sort the utterances of one big batch all at once, then split that big batch into several small batches for training, which saves time.
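As a rough illustration of this group-then-split idea, here is a minimal sketch of such a collate_fn (not the repository's exact code; the dict layout and the "text" field name are assumptions). Sorting the mega-batch by length means each of the group_size small batches ends up with utterances of similar length, so less padding is wasted when each small batch is collated into tensors:

```python
import numpy as np

def split_collate(data, batch_size, group_size):
    # `data` is one "mega-batch" of batch_size * group_size samples;
    # each sample is assumed to be a dict holding a variable-length
    # 1-D "text" array (hypothetical field name).
    lengths = np.array([len(d["text"]) for d in data])
    # Sort sample indices by utterance length, longest first.
    order = np.argsort(-lengths)
    # Split the sorted order into group_size small batches; neighbours
    # in the sorted order have similar lengths. Assumes the mega-batch
    # is full, i.e. len(data) == batch_size * group_size.
    groups = order.reshape((group_size, batch_size))
    return [[data[i] for i in idx] for idx in groups]

# Usage: 8 dummy samples of varying length -> 2 batches of 4.
samples = [{"text": np.ones(n)} for n in [5, 3, 9, 2, 7, 4, 8, 6]]
batches = split_collate(samples, batch_size=4, group_size=2)
print([[len(s["text"]) for s in b] for b in batches])  # [[9, 8, 7, 6], [5, 4, 3, 2]]
```

The training loop can then iterate over the list of small batches returned by each DataLoader step, so the effective training batch size is still batch_size.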

Georgehappy1 · Dec 27 '21


Thanks! But why do we need to sort all the utterances in a batch? Will it improve the training speed or accuracy?

chenming6615 · Feb 06 '22