iLID
what does the training data look like?
Hi, the link to the training data repo is dead. Could you fix it? I can't run train.py without knowing what is in trainingData.csv.
Thank you for your interest. The training data repo no longer exists, and we never published the raw audio files. However, you can still use the download scripts in the /data directory to fetch the same audio files directly from Voxforge or YouTube. To convert these to spectrogram images, either use the scripts in */preprocessing (see readme) or use something like SoX.
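If you go the SoX route, a minimal sketch could look like the following. This is not the project's own preprocessing script; the function name `sox_spectrogram_cmd`, the file names, and the choice of the `-r` (raw, no axes or legend) and `-m` (monochrome) options are my assumptions, and it assumes the `sox` binary is on your PATH.

```python
import subprocess


def sox_spectrogram_cmd(wav_path, png_path):
    """Build a SoX command that renders wav_path as a spectrogram PNG.

    '-n' tells SoX to discard audio output, since only the image is wanted.
    '-r' and '-m' are assumed settings, not the ones used by the project.
    """
    return ["sox", wav_path, "-n", "spectrogram", "-r", "-m", "-o", png_path]


def render_spectrogram(wav_path, png_path):
    """Run SoX and raise CalledProcessError if the conversion fails."""
    subprocess.run(sox_spectrogram_cmd(wav_path, png_path), check=True)


# Hypothetical file names, for illustration only.
example_cmd = sox_spectrogram_cmd("sample_en.wav", "sample_en.png")
```

Call `render_spectrogram` once per downloaded audio file. The image size and frequency range would most likely need tuning to match what the network expects.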
To read the images in Caffe, I recommend converting the directory of images into a LevelDB database. Use Caffe's helper for that: https://github.com/BVLC/caffe/blob/master/tools/convert_imageset.cpp
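As far as I know, convert_imageset reads a plain-text list file in which each line is an image path and an integer label separated by a space. A rough sketch for producing such lines from "path, label" rows is below; the helper name `csv_to_caffe_list` and the sample paths are made up for illustration.

```python
def csv_to_caffe_list(csv_lines):
    """Turn 'path, label' rows into 'path label' lines for a Caffe list file."""
    out = []
    for line in csv_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the last comma so that commas inside the path survive.
        path, label = line.rsplit(",", 1)
        out.append(f"{path.strip()} {int(label)}")
    return out


# Hypothetical rows, for illustration only.
listing = csv_to_caffe_list(["a.png, 0", "b.png, 2", ""])
```

Write the resulting lines to a text file and pass it to the tool together with the image root folder and the output database name. Check the flags defined at the top of convert_imageset.cpp for selecting the LevelDB backend, since the default may differ between Caffe versions.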
IIRC the TensorFlow code never worked as intended. The trainingData.csv is a CSV file that looks like this:
path_to_first_training_image.png, 0
path_to_other_training_image.png, 2
path_to_other_training_image.png, 1
...
So each line is the path to an image followed by its class (0 - EN, 1 - DE, 2 - FR, 3 - ES).
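To sanity-check a file in this format before training, something like the sketch below should work. It is not code from the repo; `read_training_csv` and the sample rows are illustrative, and only the label mapping comes from the description above.

```python
import csv
import io

# Label mapping as described above.
LABELS = {0: "EN", 1: "DE", 2: "FR", 3: "ES"}


def read_training_csv(handle):
    """Yield (image_path, label) pairs from a 'path, label' CSV file object."""
    reader = csv.reader(handle, skipinitialspace=True)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        path, label = row[0].strip(), int(row[1])
        if label not in LABELS:
            raise ValueError(f"unknown class {label} for {path}")
        yield path, label


# Hypothetical in-memory sample standing in for trainingData.csv.
sample = io.StringIO("a.png, 0\nb.png, 2\n")
rows = list(read_training_csv(sample))
```

For a real file, pass `open("trainingData.csv", newline="")` instead of the in-memory sample.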