multilingual_kws
Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus
Changes:
- `EDA` folder, which has scripts for, well, you guessed it, Exploratory Data Analysis.
- `extraction` has consolidated scripts for extraction.
- `packaging` has scripts used for tarring, uploading,...
* this verification should only apply to words that have more than the minimum number of clips for defining a split
* currently the tarballs for audio have the full path...
Most of our current alignments are for Common Voice 3/4, so re-running the alignments should create a lot more data. Low priority as of now.