icefall
I will double check that I am not missing something. cc @csukuangfj
Collecting environment information... k2 version: 1.24.3 Build type: Release Git SHA1: 42e92fdd4097adcfe9937b4d2df7736d227b8e85 Git date: Wed Jun 28 09:50:36 2023 Cuda used to build k2: 11.6 cuDNN used to build k2:...
We just created a colab notebook to show the RTF ([Real-time factor](https://openvoice-tech.net/index.php/Real-time-factor)) of the [latest zipformer](https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/zipformer) transducer model. We use [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) for the CPU test. As for the GPU test,...
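For readers unfamiliar with the metric, RTF is conventionally the time spent decoding divided by the duration of the audio decoded, so values below 1 mean faster than real time. The following is a minimal sketch of that ratio only; the helper names `real_time_factor` and `measure_rtf` are hypothetical and are not taken from the notebook, icefall, or sherpa-onnx.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(decode_fn: Callable[[], object], audio_seconds: float) -> float:
    """Time one call to a (hypothetical) decode function and return its RTF."""
    start = time.perf_counter()
    decode_fn()  # stand-in for whatever actually runs the recognizer
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


if __name__ == "__main__":
    # Example: 2 seconds of compute for 10 seconds of audio.
    print(real_time_factor(2.0, 10.0))
```

In practice the measurement is usually averaged over many utterances, since a single short file is dominated by warm-up cost.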
This PR adds an ImageNet dataloader and a training script for [Swin-Transformer](https://github.com/microsoft/Swin-Transformer/) using ScaledAdam. Currently it requires https://github.com/huggingface/pytorch-image-models, mainly for image data processing.
I noticed that in the updated zipformer recipe, the shuffled cuts are not used! Is that on purpose? https://github.com/k2-fsa/icefall/blob/d667dc365b3259179c9d54bd32a1bb2bd8afa4f0/egs/librispeech/ASR/zipformer/train.py#L1179
When I used the unigram subword method on Japanese, the WER from fast_beam_search was as expected, but when I used BPE subwords, fast_beam_search produced a lot of deletion errors as...
issue: https://github.com/lhotse-speech/lhotse/issues/1096
This is the initial version of Libriheavy, the transcribed version of [Libri-Light](https://arxiv.org/abs/1912.07875). The transcriptions are obtained by aligning the book-level reference text with the output of an ASR system. The...
I have exported a non-streaming Zipformer model to ONNX format and used the `offline-websocket-client-decode-files-paralell.py` and `offline_decode.py` scripts to decode audio files. My objective is to decode files in parallel...
Speech Commands: https://arxiv.org/pdf/1804.03209.pdf

`epoch 28 avg 2`

| metrics | result |
| -- | -- |
| True Positive | 2296 |
| False Negative | 479 |
| ...
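Only the True Positive and False Negative counts are visible in this truncated snippet, and those two are enough to derive recall (true positives over all actual positives). A small sketch of that arithmetic follows; the function name `recall` is illustrative, and no other metric from the original table is assumed.

```python
def recall(true_positive: int, false_negative: int) -> float:
    """Recall = TP / (TP + FN), the fraction of actual positives detected."""
    total_positives = true_positive + false_negative
    if total_positives == 0:
        raise ValueError("no positive examples")
    return true_positive / total_positives


if __name__ == "__main__":
    # Counts taken from the two visible rows of the table.
    print(f"{recall(2296, 479):.4f}")  # prints 0.8274
```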