icefall issues

[WIP] swbd + fisher recipe

9

I only tested it on Switchboard alone, but ran the data prep for Fisher and I expect it to work OK. We don't have eval2000 data prep yet so it's...

pzelasko

Finetune hubert

This recipe implements a HuBERT transducer model. It supports finetuning a pretrained HuBERT model with a custom vocabulary using the pruned RNN-T loss. To use this recipe, you need fairseq as a dependency. The...

marcoyang1998

Problems in streaming decoding with pruned_transducer_stateless5

8

I used my own data to train a streaming model. Recognition quality is poor when decoding. There are two obvious problems: one is that words are deleted at the end,...

yangsuxia

word or token confidence

Hi, how can I get word confidence? Can you give me some advice?

mn7026

Support using the branch from the gigaspeech dataset for decoding

2

# WER comparison

| | test-clean | test-other | comment |
|---|---|---|---|
| baseline | 2.78 | 7.36 | --iter 468000 --avg 16, greedy search |
| this PR | 3.36 | 8.20 | --iter 468000 --avg 16, greedy search |

It performs better...
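For a sense of scale, the greedy-search numbers in the table can be expressed as relative WER changes. The following is a small sketch; the helper name `relative_change` is illustrative and not part of icefall.

```python
def relative_change(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# (baseline WER, this-PR WER) pairs copied from the table above.
wer = {
    "test-clean": (2.78, 3.36),
    "test-other": (7.36, 8.20),
}

rel = {name: relative_change(b, n) for name, (b, n) in wer.items()}
for name, change in rel.items():
    # A positive value means the WER went up relative to the baseline.
    print(f"{name}: {change:+.1f}% relative WER change")
```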

csukuangfj

Label smoothing in transducer model

In this commit (https://github.com/k2-fsa/icefall/pull/166/commits/b49510e2bf7064f4f60650e6787288db1bad2941), icefall implemented label smoothing for transducer models, but it has since been removed. Why did icefall stop supporting label smoothing in transducer models?

ncakhoa

Can I use the fbank features already extracted by Kaldi to train with the icefall scripts?

11

For most speech datasets, we have already extracted fbank features with Kaldi's `compute-fbank-feats`. Is it possible to generate `(dataset_name)_cuts_train.jsonl.gz` directly from Kaldi's various list files (wav.scp, utt2spk,...
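As a rough illustration of what such a conversion starts from, the sketch below parses Kaldi-style `<key> <value>` list files and joins them by utterance ID. It uses only the standard library and does not produce a Lhotse cut manifest. The helper names, the use of `feats.scp`, and the sample entries are assumptions made for illustration. Lhotse also ships Kaldi data-directory import utilities, so its documentation is worth checking before writing a converter by hand.

```python
from typing import Dict, List


def parse_kaldi_table(text: str) -> Dict[str, str]:
    """Parse '<key> <value...>' lines, the layout of wav.scp, utt2spk and feats.scp."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        table[key] = value
    return table


def join_by_utterance(utt2spk: Dict[str, str], feats: Dict[str, str]) -> List[dict]:
    """Keep only utterances that have both a speaker and a feature entry."""
    return [
        {"id": utt_id, "speaker": utt2spk[utt_id], "features": feats[utt_id]}
        for utt_id in sorted(utt2spk)
        if utt_id in feats
    ]


# Hypothetical file contents; real files would be read from the Kaldi data dir.
utt2spk_text = "utt1 spkA\nutt2 spkB\n"
feats_text = "utt1 feats.ark:5\nutt2 feats.ark:1234\n"

records = join_by_utterance(
    parse_kaldi_table(utt2spk_text), parse_kaldi_table(feats_text)
)
print(records)
```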

Aurora-6

Preparing a new dataset

1

Hi authors, amazing work! Are there any scripts to prepare a new dataset that I can train or test on? Or, to simplify, how can I test the WER of...

WangHelin1997

LSTM-transducer for the wenetspeech dataset

Will post the results soon.

csukuangfj

`StopIteration` while trying to resume training from a checkpoint

7

Hi there. I was trying to resume training from a checkpoint, `checkpoint-28000.pt`, but it looks like the train sampler iterates until `StopIteration`:

```
2022-08-24 23:20:54,372 INFO [train.py:940] (3/4) Training...
```

wgb14
