multilingual_kws

Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus

Results 22 multilingual_kws issues

Hi! After creating a conda environment from the provided `environment.yml` file and then installing TensorFlow 2.9.0 as mentioned in the Dockerfile, I tried to run the Jupyter Notebook's cells...

Hi, I get `ERROR: Cannot find key: --keyword` when I run

```
docker run --gpus all -p 8080:8080 --rm -u $(id -u):$(id -g) -it \
  -v $(pwd):/demo_data \
  mkws...
```

```python
import os
from pathlib import Path

import tqdm

# Collect keyword directories that contain no .wav clips
uhohs = []
mswc_16khz = Path("/media/mark/hyperion/mswc/16khz_wav/en/clips")
keywords = list(sorted(os.listdir(mswc_16khz)))
print(len(keywords))
for keyword in tqdm.tqdm(keywords):
    keyword_samples = list(sorted((mswc_16khz / keyword).glob("*.wav")))
    if len(keyword_samples) == 0:
        uhohs.append(keyword)
print(len(uhohs))
# >>> 24
```

When running the intro tutorial notebook in the Docker container for `tensorflow/tensorflow:latest-gpu-jupyter`, the `umap` library can't be installed because `numba` only works on `numpy...`

Impacts certain languages more heavily than others (French, Kinyarwanda, ...) [empty_directories.txt](https://github.com/harvard-edge/multilingual_kws/files/7514971/empty_directories.txt)

In German, 'null' (zero) is being converted to `NaN` by pandas when it is the only word present in the transcript (due to single-word-target-segments data). One option is to use...

MSWC
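For the German 'null' issue above, a minimal sketch of one possible workaround, not necessarily the one the issue goes on to propose: pandas treats the string `null` as a missing value by default, and `keep_default_na=False` on `read_csv` turns that off. The CSV contents and column names here are hypothetical, made up only for illustration.

```python
import io

import pandas as pd

# Hypothetical single-word transcripts; the first row is the German word "null" (zero).
csv_text = "clip,transcript\nclip_0001.wav,null\nclip_0002.wav,eins\n"

# Default parsing: "null" is in pandas' default NA list, so it becomes NaN.
default = pd.read_csv(io.StringIO(csv_text))

# With keep_default_na=False (and no custom na_values), the string is kept as-is.
preserved = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(default["transcript"].isna().tolist())
print(preserved["transcript"].tolist())
```

Note that this disables all default NA markers for the file, so genuinely empty fields would then be read as empty strings rather than `NaN`.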

Precision/Recall/F1 displayed with the visualizer and updated as confidence interval is swept
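As a rough illustration of what sweeping could mean here, the sketch below recomputes precision, recall, and F1 at several cutoffs. It assumes the swept quantity is a detection confidence threshold, and the function name, scores, and labels are all hypothetical; this is not the visualizer's actual code.

```python
def precision_recall_f1(scores, labels, threshold):
    """Treat scores >= threshold as detections and compare against true labels."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical model confidences and whether each clip truly contains the keyword.
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False]

# Sweep the threshold and recompute the metrics at each setting.
sweep = {t: precision_recall_f1(scores, labels, t) for t in (0.3, 0.5, 0.7, 0.9)}
for threshold, (p, r, f1) in sweep.items():
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Raising the threshold in this toy data trades recall for precision, which is the kind of change a live display would make visible as the setting moves.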

Given two transcripts, 1. [hello is a common greeting] and 2. [she said, “hello”], without punctuation filtering we would treat [hello] and [“hello”] as separate words.

MSWC
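The punctuation-filtering point above can be sketched as follows. This is an illustrative approach, assuming that dropping every Unicode punctuation character is acceptable; the helper names are hypothetical and the project's actual filter may differ.

```python
import unicodedata


def strip_punctuation(word: str) -> str:
    """Drop every character whose Unicode category is punctuation (P*)."""
    return "".join(ch for ch in word if not unicodedata.category(ch).startswith("P"))


def transcript_words(transcript: str) -> list:
    """Split on whitespace, strip punctuation, lowercase, and drop empty tokens."""
    words = (strip_punctuation(w).lower() for w in transcript.split())
    return [w for w in words if w]


first = transcript_words("hello is a common greeting")
second = transcript_words("she said, “hello”")

# Both transcripts now contribute the same token for the keyword.
print(first)
print(second)
```

One caveat with this simple rule: apostrophes and hyphens are also punctuation, so words such as "don't" would be altered, and a real pipeline might need per-language exceptions.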