triplet_loss_kws
Learning Efficient Representations for Keyword Spotting with Triplet Loss
Code for the paper *Learning Efficient Representations for Keyword Spotting with Triplet Loss* by Roman Vygon ([email protected]) and Nikolay Mikhaylovskiy ([email protected]).
Prerequisites
Training
To train a triplet encoder, run:

```
python TripletEncoder.py --name=test_encoder --manifest=MANIFEST --model=MODEL
```
To train a no-triplet model, or to train a classifier based on the triplet encoder, run:

```
python TripletClassifier.py --name=test_classifier --manifest=MANIFEST --model=MODEL
```
You can use `--help` to view the descriptions of the arguments.
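For orientation, the triplet objective the encoder optimizes can be sketched in a few lines. This is an illustration of the standard triplet margin loss, not the repository's actual training code; the embeddings and the margin value below are toy values chosen for the example.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet margin loss: pull the anchor toward the positive
    embedding and push it at least `margin` further from the negative."""
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)

# Toy 2-D embeddings: anchor already close to the positive and far
# from the negative, so the constraint is satisfied and the loss is 0.
anchor   = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])
negative = np.array([2.0, 0.0])
print(triplet_loss(anchor, positive, negative))  # -> 0.0
```

During training the loss is averaged over a batch of (anchor, positive, negative) triplets, where positives share the anchor's keyword label and negatives do not.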
Hardware Requirements
Training was performed on a single Tesla K80 12GB.
| Model | Batch Size | VRAM |
|---|---|---|
| Res15 | 35*4 | 11 GB |
| Res8 | 35*10 | 4 GB |
Testing
To test a triplet encoder, run:

```
python infer_train.py --name=test_encoder --manifest=MANIFEST --model=MODEL --enc_step=ENCODER_TRAINING_STEP
```
To test a classifier-head model, run:

```
python infer_notl.py --name=test_encoder --cl_name=test_classifier --manifest=MANIFEST --model=MODEL --enc_step=ENCODER_TRAINING_STEP --cl_step=CLASSIFIER_TRAINING_STEP
```
You can use `--help` to view the descriptions of the arguments.
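A common way to turn a metric-learning encoder like this into a keyword classifier is k-nearest-neighbour search over the learned embeddings. The sketch below is illustrative only and is not the repository's inference code; the embeddings, labels, and `k` are made up for the example.

```python
import numpy as np

def knn_predict(query, train_emb, train_labels, k=3):
    """Classify a query embedding by majority vote among its k nearest
    training embeddings under Euclidean distance."""
    dists = np.linalg.norm(train_emb - query, axis=1)  # distance to every example
    nearest = np.argsort(dists)[:k]                    # indices of k closest
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)            # majority label

# Toy setup: two keyword classes forming clusters in embedding space.
train_emb = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
                      [5.0, 5.0], [5.1, 4.9], [4.9, 5.0]])
train_labels = ["yes", "yes", "yes", "no", "no", "no"]
print(knn_predict(np.array([0.05, 0.05]), train_emb, train_labels))  # -> "yes"
```

With a well-trained triplet encoder, same-keyword utterances cluster tightly, so even this simple nearest-neighbour rule classifies accurately.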
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Datasets
LibriSpeech
You can download the train-clean-360 subset here: http://www.openslr.org/12. If the site doesn't load, see this code for direct links to the files.
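If the site is down, the archives can also be fetched directly. The helper below assumes OpenSLR's usual `resources/12/<subset>.tar.gz` layout for LibriSpeech; the URL pattern is an assumption based on that convention, not taken from the missing link above.

```python
from urllib.request import urlretrieve

# Assumed OpenSLR layout: LibriSpeech archives live under resources/12/.
BASE = "https://www.openslr.org/resources/12"

def librispeech_url(subset):
    """Build the direct download URL for a LibriSpeech subset archive."""
    return f"{BASE}/{subset}.tar.gz"

# Print the direct link rather than downloading it here:
print(librispeech_url("train-clean-360"))
# To actually fetch the archive:
# urlretrieve(librispeech_url("train-clean-360"), "train-clean-360.tar.gz")
```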
Google Speech Commands
Use this notebook to download and prepare the Google Speech Commands dataset.
Additional files
~~Data manifests, LibriSpeech alignments, and distance measures can be found here. You'll need to update the `manifests.json` file with the dataset path. You can convert LibriWords manifests with `convert_path_prefix.ipynb`.~~

Sadly, the files went missing; I'll try to recover them. If anyone had a chance to download them, please contact me.