Spoken-language-identification
How to run this code
Hi, I have a speech dataset in which each recording is 2 to 5 seconds long. How can I run your code on my dataset? Could you give me some advice? Thanks.
Hi,
First you should create the spectrograms of the recordings (you can use create_spectrograms.py for that), then make training and validation list files like this, and finally run theano/main.py.
As the length of the recordings in your dataset is not constant, you should either set batch_size=1 or do something to equalize the lengths within each mini-batch.
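One possible way to equalize lengths within a mini-batch is to right-pad every spectrogram along the time axis to the width of the longest one. The sketch below is only an illustration under stated assumptions: it assumes each spectrogram is a 2-D NumPy array of shape (frequency, time), and the helper name `pad_batch` and the `pad_value` parameter are hypothetical, not part of this repository.

```python
import numpy as np


def pad_batch(spectrograms, pad_value=0.0):
    """Right-pad 2-D spectrograms (freq x time) to a common width.

    Hypothetical helper, not taken from the repository. Assumes all
    spectrograms share the same number of frequency bins.
    """
    height = spectrograms[0].shape[0]
    max_width = max(s.shape[1] for s in spectrograms)

    # Allocate the whole batch once, pre-filled with the padding value.
    batch = np.full((len(spectrograms), height, max_width),
                    pad_value, dtype=np.float32)

    # Copy each spectrogram into the left part of its slot.
    for i, spec in enumerate(spectrograms):
        batch[i, :, :spec.shape[1]] = spec
    return batch


if __name__ == "__main__":
    short = np.ones((256, 100), dtype=np.float32)
    long_ = np.ones((256, 150), dtype=np.float32)
    print(pad_batch([short, long_]).shape)  # (2, 256, 150)
```

Padding with zeros is just one choice; cropping every recording to the shortest length in the batch, or grouping recordings of similar length together, are alternatives with different trade-offs.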
Hi,
I followed your procedure and ran it on my dataset. My dataset contains 50 classes, with 400 recordings of 5 seconds each per category.
First, I created a spectrogram of each recording, 256x429 in size. Then I split them into a training set (1500 in total) and a validation set (500 in total).
Finally, I ran main.py with default parameters except:
--network==tc_net_mod
I got the result:
accuracy: 0.59 percent
accuracy: 2.08 percent
saving ... states/tc_net_mod.b32.bn.epoch499.test3.91200.state
The accuracy is extremely low, and I wonder whether there is something wrong with my experiment.
Can you give some comments?
Thanks.
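For context when reading accuracy figures like the ones above: with 50 classes, a classifier that guesses uniformly at random would score about 2 percent on a balanced set, so it can help to compare a reported accuracy against that baseline. A minimal sketch, assuming balanced classes (the function name is hypothetical and not part of the repository):

```python
def chance_accuracy_percent(num_classes):
    """Expected accuracy (in percent) of uniform random guessing,
    assuming every class is equally frequent."""
    if num_classes <= 0:
        raise ValueError("num_classes must be positive")
    return 100.0 / num_classes


if __name__ == "__main__":
    print(chance_accuracy_percent(50))  # 2.0
```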
getting error like this
@YichiHuang can you please share your code? I am trying to learn too.