quartznet-pytorch
QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
Automatic Speech Recognition (ASR) in PyTorch: a re-implementation of NVIDIA's QuartzNet.
Features:
- YouTokenToMe tokenization with BPE dropout
- Augmentations: custom and audiomentations
- Support for 3 datasets: CommonVoice, LibriSpeech and LJSpeech
- Weights & Biases logging
- CTC beam search integration
- GPU-based MelSpectrogram
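
To illustrate what CTC beam search adds over plain argmax decoding, here is a minimal pure-Python sketch of both. It is not the code used in this repository: the function names `ctc_greedy` and `ctc_beam_search`, the `blank=0` index, and the list-of-lists input format are assumptions made for the example, and no language model is involved.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_greedy(log_probs, blank=0):
    """Argmax per frame, then collapse repeats and drop blanks."""
    out, prev = [], None
    for frame in log_probs:
        token = max(range(len(frame)), key=frame.__getitem__)
        if token != prev and token != blank:
            out.append(token)
        prev = token
    return out


def ctc_beam_search(log_probs, beam_size=4, blank=0):
    """CTC prefix beam search without a language model (illustrative).

    log_probs: list of T frames, each a list of V log-probabilities.
    Each prefix tracks (log p of paths ending in blank,
                        log p of paths ending in non-blank).
    """
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):
                if token == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                new_prefix = prefix + (token,)
                b, nb = nxt[new_prefix]
                if prefix and prefix[-1] == token:
                    # A repeated token extends the prefix only after a blank...
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    sb, snb = nxt[prefix]
                    nxt[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    nxt[new_prefix] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        ranked = sorted(nxt.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best_prefix, _ = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix)


# Two frames, vocabulary {0: blank, 1: "a"}: blank is the argmax in both
# frames, but the summed probability of all paths spelling "a" is higher.
frames = [[math.log(0.6), math.log(0.4)]] * 2
greedy_result = ctc_greedy(frames)
beam_result = ctc_beam_search(frames)
```

In this toy input, greedy decoding returns an empty sequence because blank wins every frame, while beam search returns `[1]` because it sums the probabilities of all alignments that collapse to the same prefix.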
Trained models:
Dataset | WER (dummy decoder) | WER (CTC beam search) | WER, fine-tuned (dummy decoder) | WER, fine-tuned (CTC beam search) |
---|---|---|---|---|
LJSpeech | 36.66 | 34.45 | 28.41 | 27.19 |
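
For reference, word error rate (WER) is conventionally the word-level edit distance between reference and hypothesis, divided by the number of reference words. The sketch below shows that standard definition as a percentage; the function name `wer` and the whitespace tokenization are assumptions for illustration, and the metric code in this repository may normalize text differently.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, via word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
example = wer("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, WER computed this way can exceed 100 when the hypothesis is much longer than the reference.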