PyTorch Conformer
PyTorch implementation of the Conformer model with a training script for end-to-end speech recognition on the LibriSpeech dataset.
Usage
Train model from scratch:
python train.py --data_dir=./data --train_set=train-clean-100 --test_set=test-clean --checkpoint_path=model_best.pt
Resume training from a checkpoint:
python train.py --load_checkpoint --checkpoint_path=model_best.pt
Train with mixed precision:
python train.py --use_amp
For a full list of command line arguments, run:
python train.py --help
Smart batching is used by default but may need to be disabled for larger datasets. For valid train_set and test_set values, see torchaudio's LibriSpeech dataset. The model parameters default to the Conformer (S) configuration. For the Conformer (M) and Conformer (L) models, refer to the table below:
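As a rough reference, the sizes reported for the three variants in the Conformer paper can be written as a small lookup. This is only a sketch: the class and field names below are hypothetical and are not the argument names that train.py accepts, so check python train.py --help for the real flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformerConfig:
    # Field names are illustrative only; they are NOT train.py's flags.
    encoder_layers: int
    encoder_dim: int
    attention_heads: int
    conv_kernel_size: int
    decoder_layers: int
    decoder_dim: int


# Hyperparameters as listed in the Conformer paper for the S, M and L models.
CONFIGS = {
    "S": ConformerConfig(16, 144, 4, 32, 1, 320),
    "M": ConformerConfig(16, 256, 4, 32, 1, 640),
    "L": ConformerConfig(17, 512, 8, 32, 1, 640),
}

for name, cfg in CONFIGS.items():
    # Multi-head attention needs the model width to split evenly across heads.
    assert cfg.encoder_dim % cfg.attention_heads == 0
    print(name, cfg)
```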

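The idea behind smart batching can be sketched in a few lines: group utterances of similar length so that each batch needs little padding. This is a generic illustration, not the code used in train.py; the function names smart_batches and padding_cost are hypothetical.

```python
import random


def smart_batches(lengths, batch_size, shuffle=True, seed=0):
    """Group sample indices so each batch holds utterances of similar length."""
    # Sort indices by utterance length so neighbours need little padding.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Slice the sorted order into consecutive batches.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    if shuffle:
        # Shuffle whole batches, not single samples, so lengths stay grouped.
        random.Random(seed).shuffle(batches)
    return batches


def padding_cost(lengths, batches):
    """Count padded frames when every batch is padded to its longest item."""
    return sum(
        max(lengths[i] for i in batch) * len(batch) - sum(lengths[i] for i in batch)
        for batch in batches
    )


# Toy utterance lengths in frames: three short clips and three long ones.
lengths = [120, 980, 130, 1010, 125, 990]
naive = [[0, 1, 2], [3, 4, 5]]  # batches taken in dataset order
smart = smart_batches(lengths, batch_size=3)

print("naive padding:", padding_cost(lengths, naive))
print("smart padding:", padding_cost(lengths, smart))
```

Sorting needs every utterance length up front, which is presumably why this step can become costly on larger datasets.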
Other Implementations
- https://github.com/sooftware/conformer
- https://github.com/lucidrains/conformer
TODO:
- Language Model (LM) implementation
- Multi-GPU support
- Support for the full LibriSpeech 960h train set
- Support for other decoders (e.g. transformer decoder)