icefall
This PR visualizes the gradient of each node in the lattice, which is used to compute the transducer loss. The following shows some plots for different utterances. You can see...
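A minimal sketch of the general technique for such plots (not the PR's actual code, which operates on k2 lattices): keep the gradient of an intermediate tensor of per-node scores with `retain_grad()` and plot it after `backward()`. The tensor shape and the stand-in loss below are illustrative assumptions.

```python
import torch
import matplotlib.pyplot as plt

# Hypothetical per-node scores of shape (T, U) for one utterance;
# the real PR extracts these from a k2 transducer lattice.
logits = torch.randn(50, 20, requires_grad=True)
node_scores = torch.log_softmax(logits, dim=-1)  # intermediate (non-leaf) tensor

# Ask autograd to keep the gradient of this intermediate tensor.
node_scores.retain_grad()

loss = -node_scores.sum()  # stand-in for the transducer loss
loss.backward()

plt.imshow(node_scores.grad.detach().numpy(), origin="lower", aspect="auto")
plt.xlabel("label index u")
plt.ylabel("frame index t")
plt.colorbar(label="d(loss)/d(node score)")
plt.savefig("lattice_grad.png")
```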
https://github.com/k2-fsa/icefall/blob/05e7435d0d36789ba947b3ca664b730b8d79cb92/icefall/transformer_lm/train.py#L416 The formula for the moving-average calculation should be `y = (1 - k) * y + k * x`, with `1 - k` being the keep proportion, so the mentioned line should be...
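A small sketch of the exponential moving average as described above, with a hypothetical helper name and an example value of `k`:

```python
def update_moving_average(y: float, x: float, k: float) -> float:
    """Keep a fraction (1 - k) of the old value y and mix in a
    fraction k of the new observation x."""
    return (1 - k) * y + k * x

# Example: with k = 0.1 the average moves 10% of the way towards x each step.
avg = 0.0
for x in [1.0, 1.0, 1.0]:
    avg = update_moving_average(avg, x, k=0.1)
print(avg)  # 0.271 == 1 - 0.9 ** 3
```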
I plan to train an ASR model on my own data using the wenetspeech recipe in egs. I want to know how the quality of the annotations, good or bad, and the different scenes of...
In the `prepare.sh` of GigaSpeech: https://github.com/k2-fsa/icefall/blob/master/egs/gigaspeech/ASR/prepare.sh#L188 `L.pt` depends on the words, and the words are generated from `lexicon.txt`: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang.py#L354 Then `words.txt` is generated from the transcript words in `gigaspeech_supervisions_XL.jsonl.gz`: https://github.com/k2-fsa/icefall/blob/master/egs/gigaspeech/ASR/prepare.sh#L191-L238 This...
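For illustration only (not the actual `prepare.sh` logic), a sketch of how the unique transcript words could be collected from a Lhotse supervision manifest to build a `words.txt` symbol table; the manifest path, output path, and the list of special symbols are assumptions:

```python
import gzip
import json

# Each line of the manifest is assumed to be a JSON object with a "text" field.
words = set()
with gzip.open("gigaspeech_supervisions_XL.jsonl.gz", "rt") as f:
    for line in f:
        sup = json.loads(line)
        words.update(sup.get("text", "").split())

# Write a word -> integer-id symbol table; the special symbols shown here
# are illustrative and depend on the recipe.
symbols = ["<eps>", "!SIL", "<SPOKEN_NOISE>", "<UNK>"] + sorted(words) + ["#0", "<s>", "</s>"]
with open("words.txt", "w") as f:
    for i, w in enumerate(symbols):
        print(w, i, file=f)
```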
Description: I am experiencing a discrepancy in training loss when using different GPU configurations for training the Zipformer model. Specifically, I observe different training loss patterns when training on a...
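One factor worth checking (a sketch under the assumption that the recipe uses DDP with a per-GPU `--max-duration`): the effective batch per optimizer step scales with the number of GPUs, which by itself changes the shape of the loss curve.

```python
# With DDP, each of the world_size GPUs processes its own --max-duration
# seconds of audio per step, so the audio seen per optimizer step is:
def effective_batch_seconds(max_duration: float, world_size: int) -> float:
    return max_duration * world_size

print(effective_batch_seconds(600, 1))  # 600 s per step on 1 GPU
print(effective_batch_seconds(600, 4))  # 2400 s per step on 4 GPUs
```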
**ReazonSpeech** is an open-source dataset of diverse, natural Japanese speech collected from terrestrial television streams, comprising more than 35,000 hours of audio. The dataset is...
(Still working on it)
Still testing, waiting for some results.
PyTorch added two additional parameters to its implementation of the `TransformerDecoder` class; see https://github.com/pytorch/pytorch/blame/94b328ee4592605f490d422f57ad4747a92ac339/torch/nn/modules/transformer.py#L498 and https://github.com/pytorch/pytorch/pull/97166 The modification breaks all `conformer_ctc` recipes (and possibly other recipes I haven't looked into),...
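One possible way to stay compatible across PyTorch versions (a sketch, not the actual icefall fix, and assuming the breakage comes from the decoder passing new keyword arguments to its layers): have the custom layer accept and ignore any extra keyword arguments.

```python
import torch.nn as nn

class CompatDecoderLayer(nn.TransformerDecoderLayer):
    """Decoder layer that tolerates extra keyword arguments (e.g. the new
    causal flags) passed by newer versions of nn.TransformerDecoder."""

    def forward(self, tgt, memory, tgt_mask=None, memory_mask=None,
                tgt_key_padding_mask=None, memory_key_padding_mask=None,
                **unused_kwargs):
        # Drop any arguments the older API does not know about.
        return super().forward(
            tgt, memory,
            tgt_mask=tgt_mask,
            memory_mask=memory_mask,
            tgt_key_padding_mask=tgt_key_padding_mask,
            memory_key_padding_mask=memory_key_padding_mask,
        )

# Usage sketch: wrap as usual in nn.TransformerDecoder.
decoder = nn.TransformerDecoder(
    CompatDecoderLayer(d_model=256, nhead=4), num_layers=6
)
```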