
Improvement to 97% accuracy

Open Ananas120 opened this issue 5 years ago • 4 comments

Hello, I don't know if this repo is still active, but in case it helps: I used this repo for my project and found a way to improve the loss and accuracy just by L2-normalizing the output of the encoder. My score with it is currently 97% accuracy (0.97) with a validation BCE loss of 0.02, after training for 25 epochs on a mix of the LibriSpeech and CommonVoice (fr) datasets: 360 speakers in the training set and 150 in the validation set, with 200 pairs per speaker (100 same, 100 not same), a batch size of 32 (16 same, 16 not same), and an embedding dimension of 64.

Ananas120 avatar Aug 13 '20 16:08 Ananas120
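
For readers following along, here is a minimal sketch of what L2-normalizing an encoder output means. It is not the code from the poster's project: the helper names (`l2_normalize`, `euclidean_distance`) and the toy 2-dimensional embeddings are assumptions made for illustration, and plain Python lists stand in for the encoder's output tensors.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical encoder outputs: same direction, very different magnitudes.
emb_a = [3.0, 4.0]
emb_b = [30.0, 40.0]

# Without normalization the magnitude difference dominates the distance.
raw_distance = euclidean_distance(emb_a, emb_b)

# After normalization only the direction matters, and the distance
# between any two embeddings is bounded to the range [0, 2].
normalized_distance = euclidean_distance(l2_normalize(emb_a), l2_normalize(emb_b))

print(raw_distance, normalized_distance)
```

One plausible reading of the reported gain is that a bounded distance gives the final same/not-same decision a fixed range to work with, but the thread itself does not state why the normalization helps.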

@Ananas120 Hi, are you using Python 2.7 or 3.*?

Vatsalparsaniya avatar May 06 '21 09:05 Vatsalparsaniya

I re-implemented your model architecture in Python 3 in my personal project.

Ananas120 avatar May 06 '21 11:05 Ananas120

Thank you for your response. I have just reimplemented the train_siamese.py code in Python 3 and am currently training on Google Colab.

I'm getting an accuracy of exactly 0.5000 from the first training step onward. Is that normal, or am I missing something? If possible, could you share your training log?

Thank you.

Vatsalparsaniya avatar May 06 '21 11:05 Vatsalparsaniya

Yes, it is normal! I found that siamese networks need some epochs (sometimes more than 5 or 10!) before their accuracy increases. This can be explained as follows: they try to separate "same" from "not same" using a single scalar value (the distance), so they have to find the right threshold before they can get a better score. Before that, they always predict either 0 or 1.

Ananas120 avatar May 06 '21 11:05 Ananas120
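
The threshold explanation above can be illustrated with a small sketch. It assumes a balanced batch (half "same", half "not same", as in the setup described earlier in the thread) and uses made-up distance values; `accuracy_at_threshold` is a hypothetical helper, not a function from the repo.

```python
def accuracy_at_threshold(distances, labels, threshold):
    """Predict 'same' (1) when distance < threshold, else 'not same' (0)."""
    predictions = [1 if d < threshold else 0 for d in distances]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up distances for a balanced batch: 1 = same speaker, 0 = not same.
distances = [0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Threshold below every distance: everything is predicted "not same".
acc_low = accuracy_at_threshold(distances, labels, 0.0)

# Threshold above every distance: everything is predicted "same".
acc_high = accuracy_at_threshold(distances, labels, 2.5)

# Threshold between the two groups: the pairs are separated.
acc_mid = accuracy_at_threshold(distances, labels, 0.8)

print(acc_low, acc_high, acc_mid)  # 0.5 0.5 1.0
```

On balanced pairs, any constant prediction scores exactly 0.5, which matches the flat 0.5000 reported in the question; accuracy only moves once the learned threshold falls between the two groups of distances.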