
Question

Open lvaleriu opened this issue 7 years ago • 4 comments

Why do we actually need this step: "Use ffmpeg to convert and split WAV files into 10 second parts"?

After downloading, we have large WAV files. We can convert them directly to spectrogram image files, and the image gets sliced into 10-second spectrograms anyway.
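To illustrate the idea of slicing the spectrogram rather than the audio, here is a minimal sketch that computes which spectrogram columns belong to each full 10-second segment. It is not the repository's preprocessing code; the function name `segment_column_ranges` and the `sample_rate` and `hop_length` values are assumptions chosen for illustration.

```python
def segment_column_ranges(num_columns, sample_rate, hop_length, segment_seconds=10.0):
    """Return (start, end) column index pairs, one per full segment.

    Assumes one spectrogram column per `hop_length` audio samples.
    A trailing remainder shorter than one segment is dropped.
    """
    columns_per_segment = int(segment_seconds * sample_rate / hop_length)
    if columns_per_segment <= 0:
        raise ValueError("segment is shorter than one spectrogram column")

    ranges = []
    start = 0
    while start + columns_per_segment <= num_columns:
        ranges.append((start, start + columns_per_segment))
        start += columns_per_segment
    return ranges


# Hypothetical settings: 16 kHz audio, hop of 160 samples (100 columns per second).
ranges = segment_column_ranges(num_columns=2500, sample_rate=16000, hop_length=160)
for start, end in ranges:
    print(start, end)
```

Each pair can then be used to crop the full spectrogram image along its time axis, so no intermediate 10-second WAV files need to be written to disk.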

lvaleriu avatar Oct 11 '18 23:10 lvaleriu

Of course you can also do it in this way... if you think that this works better for you, then go ahead...

Bartzi avatar Oct 12 '18 08:10 Bartzi

It is mainly because I don't want to store the segmented WAV files as well (they take up 88 GB on my disk). For the same reason, I now save the downloaded YouTube files directly as MP3. And I've managed to extract 10-second spectrograms from the MP3s quite quickly, actually.

lvaleriu avatar Oct 12 '18 08:10 lvaleriu

How much data should I use for classifying between Hindi and English? Are 20,000 spectrograms per language sufficient?

omfuke avatar Nov 12 '20 09:11 omfuke

Sounds like a good amount of data. I think it could work!

Bartzi avatar Nov 12 '20 09:11 Bartzi