PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan et al.
Can I sample the audio at 22050 Hz instead of 16000 Hz? Will I only have to change the hparams, or the model's parameters as well?
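Rough sketch of what changes at a different sample rate: the mel dimension the model consumes stays the same, but any window/hop lengths expressed in samples are derived from the sample rate and must be recomputed. The 25 ms / 10 ms values below are typical defaults, not necessarily what this repo's config.yaml uses:

```python
def frame_params(sr, win_ms=25, hop_ms=10):
    """Window and hop lengths in samples for a given sample rate."""
    return int(sr * win_ms / 1000), int(sr * hop_ms / 1000)

# The same 25 ms window is 400 samples at 16 kHz but 551 at 22.05 kHz,
# so sample-count hparams tuned for 16 kHz are wrong at 22.05 kHz.
print(frame_params(16000))  # (400, 160)
print(frame_params(22050))  # (551, 220)
```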
UnboundLocalError: local variable 'batch_id' referenced before assignment
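A common cause of this error is referencing the loop variable after a for loop that ran zero times (e.g. an empty or misconfigured DataLoader), so batch_id is never assigned. A minimal illustration; train_epoch and its list argument are hypothetical, not the repo's code:

```python
def train_epoch(batches):
    for batch_id, batch in enumerate(batches):
        pass  # training step would go here
    # If `batches` was empty, batch_id was never bound and
    # referencing it raises UnboundLocalError.
    try:
        return batch_id
    except UnboundLocalError:
        return None

train_epoch([])                 # loop body never runs -> None
train_epoch(["b0", "b1", "b2"]) # last batch_id -> 2
```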
I'm training with the CPU backend for PyTorch with Python 3.7 on Anaconda on a quad-core CPU, and it seems very slow. After running for 12 hours, it had only completely...
Hi! Could you please explain how to run dvector_create.py on the TIMIT dataset? This program tries to load some .wav files (line 91). However, the original data in TIMIT are...
Using a DataParallel model to start multi-GPU training and changing the batch size in config.yaml does not speed up training.
In `dvector_create.py`, an audio file is converted to a sequence of dvectors. However, the time index of each dvector is lost, so if a classification is performed using that dvector,...
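One way to recover timing: each d-vector comes from a sliding window over fixed-hop frames, so its time span can be reconstructed from its window index. The parameters below (10 ms frame hop, 80-frame windows sliding by 40 frames) are assumptions for illustration, not values confirmed from dvector_create.py:

```python
def dvector_interval(i, frame_hop_s=0.01, window_frames=80, slide_frames=40):
    """(start, end) time in seconds covered by the i-th d-vector window.

    Assumes consecutive windows slide by `slide_frames` frames and each
    frame advances by `frame_hop_s` seconds -- adjust to the real hparams.
    """
    start = i * slide_frames * frame_hop_s
    return start, start + window_frames * frame_hop_s
```

Storing these (start, end) pairs alongside each d-vector keeps the alignment needed to map classification results back onto the audio.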
Running dvector_create.py with default arguments and the TIMIT data throws the exception:
Traceback (most recent call last):
  File "dvector_create.py", line 94, in
    times, segs = VAD_chunk(2, folder+'/'+file)
  File "PyTorch_Speaker_Verification/VAD_segments.py", line...
Hi, thanks for this work. I am using the output of dvector_create.py as input to uis-rnn, and diarization is also done. But I have a small confusion about the number of...
Running the test method throws an exception:
  File "./train_speech_embedder.py", line 117, in test
    enrollment_batch, verification_batch = torch.split(mel_db_batch, int(mel_db_batch.size(1)/2), dim=1)
ValueError: too many values to unpack (expected 2)
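For context on the unpacking error: torch.split(tensor, n, dim) returns one chunk per ceil(size/n), so when the number of utterances along dim=1 is odd, int(size/2) yields three chunks and two-name unpacking fails. A pure-Python sketch mimicking that chunking (list slicing stands in for tensor splitting):

```python
def split_chunks(seq, chunk_size):
    """Mimic torch.split along one dimension: fixed-size chunks,
    with the last chunk smaller when the length does not divide evenly."""
    return [seq[i:i + chunk_size] for i in range(0, len(seq), chunk_size)]

utts = list(range(5))            # odd number of utterances per speaker
chunks = split_chunks(utts, 5 // 2)
# 3 chunks -- [0, 1], [2, 3], [4] -- so this would fail:
# enrollment, verification = chunks  -> "too many values to unpack (expected 2)"
```

Using an even number of utterances per speaker in the test config avoids the leftover chunk.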
I can't fully understand the process of calculating the EER. I would be grateful if someone could explain it to me.
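In broad strokes, the equal error rate is the operating point where the false-accept rate (impostor trials scored at or above the threshold) equals the false-reject rate (target trials scored below it). A minimal pure-Python sketch that sweeps candidate thresholds over the observed scores; real implementations typically interpolate the ROC curve instead:

```python
def compute_eer(scores, labels):
    """EER from similarity scores; labels: 1 = same-speaker trial, 0 = impostor."""
    targets = [s for s, l in zip(scores, labels) if l == 1]
    impostors = [s for s, l in zip(scores, labels) if l == 0]
    best_gap, eer = None, None
    for t in sorted(set(scores)):
        far = sum(s >= t for s in impostors) / len(impostors)  # false accepts
        frr = sum(s < t for s in targets) / len(targets)       # false rejects
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Perfectly separable scores give an EER of 0.
compute_eer([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])  # 0.0
```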