FastSpeech2
Why do we need `batch_size=batch_size * group_size`? What is the point of the `group_size` variable?
How does setting `group_size` larger than 1 enable sorting in the Dataset?
And why do we need to enable sorting in the Dataset at all?
https://github.com/ming024/FastSpeech2/blob/d4e79eb52e8b01d24703b2dfc0385544092958f3/train.py#L31

```python
batch_size = train_config["optimizer"]["batch_size"]
group_size = 4  # Set this larger than 1 to enable sorting in Dataset
assert batch_size * group_size < len(dataset)
loader = DataLoader(
    dataset,
    batch_size=batch_size * group_size,
    shuffle=True,
    collate_fn=dataset.collate_fn,
)
```
@chenming6615 Because in the `collate_fn` function we need to sort all the utterances in a batch, which takes time. If we set `group_size > 1`, we can sort the utterances once in one big batch and then split the big batch into several small batches for training, which saves time.
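To make the sort-then-split idea concrete, here is a minimal, self-contained sketch (not the repository's actual `collate_fn`; the function name `sort_and_split` and the plain-list padding are assumptions for illustration). It sorts one big batch of variable-length utterances by length, slices it into `group_size` small batches, and pads each small batch only to its own local maximum length:

```python
def sort_and_split(big_batch, group_size):
    """Hypothetical sketch of the sort-then-split trick.

    Sort one big batch of variable-length sequences by length (longest
    first), then split it into `group_size` small batches. Sorting is done
    once per big batch, and because neighbors in the sorted order have
    similar lengths, each small batch needs little padding.
    """
    # Indices of the sequences, longest first
    order = sorted(range(len(big_batch)),
                   key=lambda i: len(big_batch[i]), reverse=True)
    small_size = len(big_batch) // group_size
    batches = []
    for g in range(group_size):
        idxs = order[g * small_size:(g + 1) * small_size]
        seqs = [big_batch[i] for i in idxs]
        # Pad each small batch to its own local maximum length
        max_len = max(len(s) for s in seqs)
        batches.append([s + [0] * (max_len - len(s)) for s in seqs])
    return batches

# Example: 8 utterances of lengths 3..9, group_size=2 -> 2 small batches of 4.
# The long half is padded to length 9, the short half only to length 5.
utterances = [[1] * n for n in (3, 9, 5, 2, 7, 4, 8, 6)]
small_batches = sort_and_split(utterances, group_size=2)
```

Had the 8 utterances been batched unsorted, every sequence would be padded to the global maximum (9); after sorting, the shorter half is padded only to 5, so less computation is wasted on padding.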
Thanks! But why do we need to sort the utterances in a batch at all? Does it improve training speed or accuracy?