PyTorch_Speaker_Verification

Shuffling wav files in dataloader does not ensure that all the training files are checked in each epoch

Open · dkatsiros opened this issue 4 years ago · 2 comments

As a result, the model is trained on N*M utterances per epoch rather than on the whole training set. This affects convergence as well as possible extensions of the code (e.g., early stopping).

where N = number of speakers per batch and M = number of utterances per speaker per batch, following the referenced paper. https://github.com/HarryVolek/PyTorch_Speaker_Verification/blob/10e159a8d3255503c0184cde4eb7097968857a31/data_load.py#L39-L40
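For illustration, here is a minimal sketch (not the repo's exact code; class and variable names are hypothetical) of how the sampling in the linked lines appears to behave: a fresh random subset of M utterances is drawn on every `__getitem__` call, so repeated epochs can keep re-drawing the same utterances and never touch others.

```python
import numpy as np
from torch.utils.data import Dataset

class SpeakerDatasetSketch(Dataset):
    """Simplified, hypothetical stand-in for the repo's dataset.

    Each item corresponds to one speaker; M utterances are drawn at
    random from that speaker's pre-processed array on every call, so
    full coverage of the training utterances is not guaranteed.
    """
    def __init__(self, utter_per_speaker, M):
        # utter_per_speaker: dict speaker -> np.ndarray [n_utters, n_frames, n_mels]
        self.utter_per_speaker = utter_per_speaker
        self.speakers = list(utter_per_speaker.keys())
        self.M = M

    def __len__(self):
        return len(self.speakers)

    def __getitem__(self, idx):
        utters = self.utter_per_speaker[self.speakers[idx]]
        # Mirrors the behaviour questioned in this issue:
        # a new random choice of M utterances each time.
        sel = np.random.randint(0, utters.shape[0], self.M)
        return utters[sel]
```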

dkatsiros avatar Nov 16 '20 19:11 dkatsiros

For the TIMIT dataset, where M=9 (I think), the dataloader may be OK. The issue appears in larger datasets such as VoxCeleb1 or VoxCeleb2, where M>50.
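One possible direction for a fix (a sketch only, not the actual PR, and the helper name is hypothetical) is to keep a per-speaker cursor so that successive epochs walk through disjoint chunks of M utterances, reshuffling only after every utterance has been used once:

```python
import numpy as np

class EpochCoveringSelector:
    """Hypothetical helper: yields disjoint chunks of M utterance indices
    per speaker, reshuffling only after a full pass over all utterances.
    Assumes n_utters >= M."""
    def __init__(self, n_utters, M):
        self.M = M
        self.order = np.random.permutation(n_utters)
        self.pos = 0

    def next_chunk(self):
        if self.pos + self.M > len(self.order):
            # All utterances consumed: reshuffle and start a new pass.
            self.order = np.random.permutation(len(self.order))
            self.pos = 0
        chunk = self.order[self.pos:self.pos + self.M]
        self.pos += self.M
        return chunk
```

Calling something like `next_chunk()` inside `__getitem__` instead of drawing random indices would let every utterance of a speaker be seen roughly once per `ceil(n_utters / M)` epochs.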

dkatsiros avatar Nov 16 '20 20:11 dkatsiros

@HarryVolek Can you check this, please? If that is the case, I will open a PR.

dkatsiros avatar Nov 16 '20 21:11 dkatsiros