speaker-recognition-pytorch

How to use your own dataset

Open nidhal1231 opened this issue 6 years ago • 5 comments

Hello, this code is written for the TIMIT dataset, and it does not work when I switch to my own audio data. How can I use my own audio data as the dataset? Thank you very much for your guidance.

nidhal1231, May 07 '19 12:05

@nidhal1231 If you don't know how to prepare your own dataset, you can restructure it to match the layout of TIMIT. In fact, you can also modify the path in config/conf.yaml to point to your own dataset.

Aurora11111, May 13 '19 02:05

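@nidhal1231 You should also convert your audio to WAV, for example: subprocess.call(['ffmpeg', '-i', src, os.path.splitext(src)[0] + '.wav']), where src is the path of a non-WAV input file (if src already ended in .wav, the input and output paths would be the same).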
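The conversion step above can be sketched a little more fully. This is only an illustration, not code from the repository: the helper names `ffmpeg_to_wav_cmd` and `convert_to_wav` are hypothetical, and the 16 kHz mono output is an assumption chosen to match TIMIT's sample rate. It uses the standard ffmpeg options `-y` (overwrite), `-ac` (channels) and `-ar` (sample rate), and it avoids writing the output over the input when the source is already a .wav file.

```python
import os
import subprocess


def ffmpeg_to_wav_cmd(src, sample_rate=16000):
    """Build an ffmpeg command that converts `src` to a mono WAV file.

    Hypothetical helper; 16 kHz mono is an assumption, adjust as needed.
    """
    stem, ext = os.path.splitext(src)
    # If the input is already a .wav file, write to a new name so ffmpeg
    # never reads from and writes to the same path.
    dst = stem + ("_converted.wav" if ext.lower() == ".wav" else ".wav")
    cmd = ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]
    return cmd, dst


def convert_to_wav(src, sample_rate=16000):
    """Run ffmpeg on `src` and return the output path.

    Requires the ffmpeg executable to be available on PATH.
    """
    cmd, dst = ffmpeg_to_wav_cmd(src, sample_rate)
    subprocess.check_call(cmd)  # raises CalledProcessError if ffmpeg fails
    return dst
```

Calling `convert_to_wav("recordings/clip.mp3")` would write `recordings/clip.wav` next to the original file, provided ffmpeg is installed.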

Aurora11111, May 13 '19 02:05

@nidhal1231 I uploaded some code that I used to create numpy files for uis-rnn; you can use it as a reference.

Aurora11111, May 13 '19 03:05

@Aurora11111 You can't create numpy files for uis-rnn, because UIS-RNN must use diarization datasets in which each utterance has multiple speakers speaking in turn. Can you explain in more detail how to change the structure of my own dataset so that it is similar to TIMIT? And what is the structure of the TIMIT dataset? Thank you in advance.

nidhal1231, May 13 '19 09:05

@nidhal1231 Maybe you should download the TIMIT dataset and look at its folder structure.
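For reference, TIMIT is organized as split / dialect region / speaker / utterance, for example TRAIN/DR1/FCJF0/SA1.WAV. A minimal sketch of copying your own recordings into a similar tree is shown below. It is an illustration under assumptions, not code from the repository: `timit_style_path`, `build_timit_style_tree` and `speaker_to_wavs` are hypothetical names, "DR1" is only a placeholder region for data that has no dialect labels, and whether the repository's loader needs exactly this directory depth should be checked against the path pattern in config/conf.yaml.

```python
import os
import shutil


def timit_style_path(root, speaker_id, wav_path, split="TRAIN", region="DR1"):
    """Return root/split/region/speaker_id/<file name> for one utterance."""
    return os.path.join(root, split, region, speaker_id, os.path.basename(wav_path))


def build_timit_style_tree(root, speaker_to_wavs, split="TRAIN", region="DR1"):
    """Copy each speaker's WAV files into a TIMIT-like directory tree.

    `speaker_to_wavs` maps a speaker ID to a list of existing WAV file paths.
    Returns the list of destination paths that were written.
    """
    copied = []
    for speaker_id, wav_paths in speaker_to_wavs.items():
        for wav_path in wav_paths:
            dst = timit_style_path(root, speaker_id, wav_path, split, region)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy(wav_path, dst)  # copy, so the original data stays intact
            copied.append(dst)
    return copied
```

With one directory per speaker, every utterance of a speaker ends up in the same folder, which is the property a speaker-recognition loader typically relies on to assign labels.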

Aurora11111, May 14 '19 02:05