deepspeech-cleaner
Not working with CommonVoice data
Hey! First of all, thanks for this great little helper.
I think CommonVoice recently changed its download format, and this library no longer works with it. I had to download the files manually, convert the .tsv files to .csv (yes, CV gives you .tsv now), and change a bit of code in downloader.py to make it work. Just letting you know.
Small Perl script for the .tsv -> .csv conversion: perl -lpe 's/"/""/g; s/^|$/"/g; s/\t/","/g' < input.tsv > output.csv
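In case Perl is not at hand, here is a rough Python equivalent of the same conversion, using only the standard csv module. The function name tsv_to_csv and the file names are just placeholders I picked for illustration; nothing here is part of deepspeech-cleaner. Like the Perl one-liner, it assumes one record per line, treats quote characters in the .tsv as literal text, and quotes every output field.

```python
import csv


def tsv_to_csv(tsv_path, csv_path):
    """Convert a tab-separated file to a fully quoted CSV file.

    Hypothetical helper, not part of deepspeech-cleaner.
    """
    with open(tsv_path, newline="", encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        # QUOTE_NONE: quote characters in the input are ordinary text,
        # which matches how the Perl one-liner treats them.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        # QUOTE_ALL wraps every field in quotes and doubles embedded quotes.
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
        writer.writerows(reader)
```

Usage would be something like tsv_to_csv("input.tsv", "output.csv"), run once per file in the download.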
Also, I changed line 193 in downloader.py from ...['/cv-other-train.csv to ...['/train.tsv.