voice-vector: How to preprocess VoxCeleb data, and why is acc == 0?

Open colinsongf opened this issue 6 years ago • 10 comments

How do I preprocess the VoxCeleb data in order to run train.py?

Why is acc=0 after 200 iterations on 2 GPUs?

May 12 '18 09:05 colinsongf

Hi,

Got the same issue on the voxceleb_v1 dataset. I can see the loss is consistently decreasing from 7.2 to 6.01, but the eval/train accuracy is always 0. Have you solved this?

May 20 '18 15:05 ChristopherLu

No, I haven't, sorry.

May 21 '18 00:05 colinsongf

Is this because a certain preprocessing step for the VoxCeleb data is missing?

May 21 '18 10:05 ChristopherLu

I think so, but I do not know how to preprocess the VoxCeleb data!

May 22 '18 04:05 colinsongf

How should the VoxCeleb dataset be preprocessed to drop silence segments?

May 22 '18 07:05 colinsongf

@ChristopherLu @colinsongf I preprocessed the VoxCeleb dataset to a sample rate of 16,000 Hz, which matches the setting in my default.yaml config.

May 22 '18 10:05 andabi

@andabi

Could you share with us the procedure for getting 'voxceleb_norm'? Is it the data after preprocessing? We are confused about the right procedure for running the code on VoxCeleb. It would be great if you could share the recipe or pipeline you used.

Thanks

May 22 '18 10:05 ChristopherLu

voxceleb_norm is the processed dataset. It is organized into one directory per celeb, and each directory contains that celeb's audio files in wav format at a sample rate of 16,000 Hz. You need to do this preprocessing before training.
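For anyone trying to reproduce that layout, here is a minimal sketch of one way to plan the conversion. It is not the author's actual script. It assumes ffmpeg is installed, that the raw data sits under a hypothetical `voxceleb_raw/<celeb>/...` tree, and that nested sub-directories can be flattened into the file name. `plan_job` and `convert_tree` are made-up helper names.

```python
import subprocess
from pathlib import Path

TARGET_SAMPLE_RATE = 16000  # sample rate stated in the thread


def plan_job(src_root, dst_root, rel_path):
    """Map one source file to (destination path, ffmpeg command).

    Hypothetical helper: the first path component is treated as the celeb
    directory, and any deeper components are joined into the file name.
    """
    rel = Path(rel_path)
    if len(rel.parts) < 2:
        raise ValueError("expected <celeb>/<file>, got %s" % rel)
    celeb = rel.parts[0]
    flat_name = "_".join(rel.parts[1:])
    src = Path(src_root) / rel
    dst = (Path(dst_root) / celeb / flat_name).with_suffix(".wav")
    # -ar sets the output sample rate; -y overwrites existing output files.
    cmd = ["ffmpeg", "-y", "-i", str(src), "-ar", str(TARGET_SAMPLE_RATE), str(dst)]
    return dst, cmd


def convert_tree(src_root, dst_root, pattern="*.wav"):
    """Run the planned ffmpeg command for every matching file (assumed layout)."""
    src_root = Path(src_root)
    for src in sorted(src_root.rglob(pattern)):
        dst, cmd = plan_job(src_root, dst_root, src.relative_to(src_root))
        dst.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)
```

Calling `convert_tree("voxceleb_raw", "voxceleb_norm")` would then write one directory per celeb. Both directory names are placeholders, so point them at wherever your data lives and at the data path your default.yaml expects.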

May 23 '18 00:05 andabi

After resampling all the wav files to 16,000 Hz, the result is still acc=0. Why?

May 23 '18 06:05 colinsongf

Keep training for at least a few days, because VoxCeleb is huge. I trained the model for a few days on 8 GPUs to get over 90% eval accuracy.
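As a rough sanity check on why accuracy can sit at 0 early on, the sketch below compares the loss values reported earlier in the thread (7.2 falling to 6.01) with chance level. It assumes the training objective is a softmax cross-entropy over speaker identities using the natural log, which the thread does not confirm. It also assumes the 1,251 speakers of VoxCeleb1, so change `NUM_SPEAKERS` for a different split.

```python
import math

NUM_SPEAKERS = 1251  # VoxCeleb1 speaker count; an assumption about the setup


def chance_level_loss(num_classes):
    """Cross-entropy of a uniform guess over num_classes speakers."""
    return math.log(num_classes)


def mean_correct_prob(loss):
    """Geometric-mean probability assigned to the correct speaker at a given loss."""
    return math.exp(-loss)


chance = chance_level_loss(NUM_SPEAKERS)
print("chance-level loss: %.2f" % chance)
print("correct-class probability at loss 6.01: %.4f" % mean_correct_prob(6.01))
```

Under those assumptions chance-level loss is about 7.13, so a loss near 7.2 is essentially an untrained model. At 6.01 the correct speaker still gets only about 0.25% probability on average, which is consistent with near-zero accuracy even though the loss is falling.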

May 23 '18 11:05 andabi
