icefall
This PR visualizes the gradient of each node in the lattice, which is used to compute the transducer loss. The following shows some plots for different utterances. You can see...
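A minimal sketch of the general technique for such plots (not the PR's actual code, which operates on k2 lattices): keep the gradient of an intermediate tensor of per-node scores with `retain_grad()` and plot it after `backward()`. The tensor shape and the stand-in loss below are illustrative assumptions.

```python
import torch
import matplotlib.pyplot as plt

# Hypothetical per-node scores of shape (T, U) for one utterance;
# the real PR extracts these from a k2 transducer lattice.
logits = torch.randn(50, 20, requires_grad=True)
node_scores = torch.log_softmax(logits, dim=-1)  # intermediate (non-leaf) tensor

# Ask autograd to keep the gradient of this intermediate tensor.
node_scores.retain_grad()

loss = -node_scores.sum()  # stand-in for the transducer loss
loss.backward()

plt.imshow(node_scores.grad.detach().numpy(), origin="lower", aspect="auto")
plt.xlabel("label index u")
plt.ylabel("frame index t")
plt.colorbar(label="d(loss)/d(node score)")
plt.savefig("lattice_grad.png")
```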
https://github.com/k2-fsa/icefall/blob/05e7435d0d36789ba947b3ca664b730b8d79cb92/icefall/transformer_lm/train.py#L416 The formula for the moving-average calculation should be `y = (1 - k) * y + k * x`, with `1 - k` being the keep proportion, so the mentioned line should be...
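A small sketch of the exponential moving average as described above, with a hypothetical helper name and an example value of `k`:

```python
def update_moving_average(y: float, x: float, k: float) -> float:
    """Keep a fraction (1 - k) of the old value y and mix in a
    fraction k of the new observation x."""
    return (1 - k) * y + k * x

# Example: with k = 0.1 the average moves 10% of the way towards x each step.
avg = 0.0
for x in [1.0, 1.0, 1.0]:
    avg = update_moving_average(avg, x, k=0.1)
print(avg)  # 0.271 == 1 - 0.9 ** 3
```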
I plan to train an ASR model on my own data using the wenetspeech recipe in egs. I want to know how the quality of the annotations, good or bad, and the different scenes of...
In the `prepare.sh` of GigaSpeech: https://github.com/k2-fsa/icefall/blob/master/egs/gigaspeech/ASR/prepare.sh#L188 `L.pt` depends on the words, and the words are generated from `lexicon.txt`: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang.py#L354 Then `words.txt` is generated from the transcript words in `gigaspeech_supervisions_XL.jsonl.gz`: https://github.com/k2-fsa/icefall/blob/master/egs/gigaspeech/ASR/prepare.sh#L191-L238 This...
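For illustration only (not the actual `prepare.sh` logic), a sketch of how the unique transcript words could be collected from a Lhotse supervision manifest to build a `words.txt` symbol table; the manifest path, output path, and the list of special symbols are assumptions:

```python
import gzip
import json

# Each line of the manifest is assumed to be a JSON object with a "text" field.
words = set()
with gzip.open("gigaspeech_supervisions_XL.jsonl.gz", "rt") as f:
    for line in f:
        sup = json.loads(line)
        words.update(sup.get("text", "").split())

# Write a word -> integer-id symbol table; the special symbols shown here
# are illustrative and depend on the recipe.
symbols = ["<eps>", "!SIL", "<SPOKEN_NOISE>", "<UNK>"] + sorted(words) + ["#0", "<s>", "</s>"]
with open("words.txt", "w") as f:
    for i, w in enumerate(symbols):
        print(w, i, file=f)
```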
Description: I am experiencing a discrepancy in training loss when using different GPU configurations for training the Zipformer model. Specifically, I observe different training loss patterns when training on a...
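One factor worth checking (a sketch under the assumption that the recipe uses DDP with a per-GPU `--max-duration`): the effective batch per optimizer step scales with the number of GPUs, which by itself changes the shape of the loss curve.

```python
# With DDP, each of the world_size GPUs processes its own --max-duration
# seconds of audio per step, so the audio seen per optimizer step is:
def effective_batch_seconds(max_duration: float, world_size: int) -> float:
    return max_duration * world_size

print(effective_batch_seconds(600, 1))  # 600 s per step on 1 GPU
print(effective_batch_seconds(600, 4))  # 2400 s per step on 4 GPUs
```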
**ReazonSpeech** is an open-source dataset of diverse, natural Japanese speech collected from terrestrial television streams, comprising more than 35,000 hours of audio. The dataset is...
(Still working on it)
Still testing, waiting for some results.
PyTorch added two additional parameters to its implementation of the `TransformerDecoder` class; see https://github.com/pytorch/pytorch/blame/94b328ee4592605f490d422f57ad4747a92ac339/torch/nn/modules/transformer.py#L498 and https://github.com/pytorch/pytorch/pull/97166 The modification breaks all `conformer_ctc` recipes (and possibly other recipes I haven't looked into),...
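One possible way to stay compatible across PyTorch versions (a sketch, not the actual icefall fix, and assuming the breakage comes from the decoder passing new keyword arguments to its layers): have the custom layer accept and ignore any extra keyword arguments.

```python
import torch.nn as nn

class CompatDecoderLayer(nn.TransformerDecoderLayer):
    """Decoder layer that tolerates extra keyword arguments (e.g. the new
    causal flags) passed by newer versions of nn.TransformerDecoder."""

    def forward(self, tgt, memory, tgt_mask=None, memory_mask=None,
                tgt_key_padding_mask=None, memory_key_padding_mask=None,
                **unused_kwargs):
        # Drop any arguments the older API does not know about.
        return super().forward(
            tgt, memory,
            tgt_mask=tgt_mask,
            memory_mask=memory_mask,
            tgt_key_padding_mask=tgt_key_padding_mask,
            memory_key_padding_mask=memory_key_padding_mask,
        )

# Usage sketch: wrap as usual in nn.TransformerDecoder.
decoder = nn.TransformerDecoder(
    CompatDecoderLayer(d_model=256, nhead=4), num_layers=6
)
```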