Facebook AI Research's Automatic Speech Recognition Toolkit

wav2letter++

Join the chat at https://gitter.im/wav2letter/community

Important Note:

wav2letter has been moved and consolidated into Flashlight, where it now lives in the ASR application.

Future wav2letter development will occur in Flashlight.

To build the old, pre-consolidation version of wav2letter, check out the wav2letter v0.2 release, which depends on the old Flashlight v0.2 release. The wav2letter-lua project can be found on the wav2letter-lua branch.

For more information on wav2letter++, see or cite this arXiv paper.

Recipes

This repository includes recipes to reproduce the following research papers, as well as pre-trained models. To reproduce results exactly, use Flashlight <= 0.3.2. Papers contained here include:

  • Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets
  • Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
  • Kahn et al. (2020): Self-Training for End-to-End Speech Recognition
  • Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition
  • Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

Data preparation for training and evaluation can be found in the data directory.

Building the Recipes

First, install Flashlight with the ASR application; the 0.3 branch is required. Then, from the root of this repository, build the recipes:

mkdir build && cd build
cmake .. && make -j8

If Flashlight or ArrayFire are installed in nonstandard paths via a custom CMAKE_INSTALL_PREFIX, they can be found by passing

-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake

when running cmake.
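
As a minimal sketch of how these flags fit into the build command, the snippet below assembles them from a single prefix and prints the resulting cmake invocation as a dry run. The helper name w2l_cmake_flags and the example prefix /opt/flashlight are assumptions made for illustration; neither is part of wav2letter or Flashlight.

```shell
# Hypothetical helper (not part of wav2letter): prints the two CMake flags
# from this README for a given install prefix.
w2l_cmake_flags() {
  prefix="$1"
  printf '%s %s\n' \
    "-Dflashlight_DIR=${prefix}/usr/share/flashlight/cmake/" \
    "-DArrayFire_DIR=${prefix}/usr/share/ArrayFire/cmake"
}

# Dry run: print the full command for an example prefix (an assumption;
# substitute your own CMAKE_INSTALL_PREFIX) instead of executing it.
echo "cmake .. $(w2l_cmake_flags /opt/flashlight)"
```

Once the printed command looks right for your installation, run it from the build directory in place of the plain cmake .. call, then run make -j8 as before.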

Join the wav2letter community

License

wav2letter++ is MIT-licensed, as found in the LICENSE file.