Piotr Żelasko

523 comments

I'll look into it. I'm also looking at other aspects of the recipe - e.g. we're currently using position-dependent phones, so we're getting 4x the number of output symbols...
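For context, a minimal sketch (illustrative phone names, not the actual snowfall lexicon code) of why position-dependent phones roughly quadruple the output symbol inventory: each base phone gets a Kaldi-style word-position suffix (`_B`, `_I`, `_E`, `_S`).

```python
# Hypothetical example: position-dependent phones multiply the symbol set by ~4.
from itertools import product

base_phones = ["AA", "AE", "B", "K"]   # illustrative phone inventory
positions = ["_B", "_I", "_E", "_S"]   # word-begin / internal / end / singleton

position_dependent = [p + s for p, s in product(base_phones, positions)]
print(len(base_phones), "->", len(position_dependent))  # 4 -> 16, i.e. 4x

def strip_position(phone: str) -> str:
    """Map a position-dependent phone back to its position-independent form."""
    for suffix in positions:
        if phone.endswith(suffix):
            return phone[: -len(suffix)]
    return phone

assert strip_position("AA_B") == "AA"
```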

I changed the phones to position-independent, and here's what an example of the posteriors looks like in an unmodified model (the first is the "as-is" output, the other one is...
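For reference, a rough sketch of how such a posterior heatmap can be produced; the forward call and output shape below are assumptions for illustration, not the actual snowfall model signature.

```python
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def plot_posteriors(model: torch.nn.Module, features: torch.Tensor) -> None:
    """features: (1, num_frames, feat_dim) batch with a single utterance (assumed shape)."""
    model.eval()
    nnet_output = model(features)                # assumed to return (1, T, num_symbols)
    posteriors = nnet_output.softmax(dim=-1)[0]  # (T, num_symbols)
    plt.imshow(posteriors.T.cpu().numpy(), aspect="auto", origin="lower")
    plt.xlabel("frame")
    plt.ylabel("output symbol")
    plt.colorbar()
    plt.show()
```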

(this is in the middle of training, i.e. checkpoint from epoch 5)

Update: never mind, it was not the latest k2, and the WER for that model is still 99%. Almost all the hypotheses are empty texts. Will keep looking.

FYI I ran the full 960h LibriSpeech training (with speed perturbation) with the CTC graph; the WER is:
```
2021-01-03 08:53:22,337 INFO [decode.py:217] %WER 10.05% [5285 / 52576, 725 ins,...
```
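For anyone parsing the log line above, a small sketch of how the Kaldi-style %WER figure is derived: the 5285 total errors and 52576 reference words come from the log, while the full insertion/deletion/substitution breakdown is cut off in the excerpt.

```python
# %WER = 100 * (insertions + deletions + substitutions) / reference words.
# Only the total error count and reference word count are used here.
def wer_percent(total_errors: int, ref_words: int) -> float:
    return 100.0 * total_errors / ref_words

print(f"%WER {wer_percent(5285, 52576):.2f}%")  # -> %WER 10.05%
```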

Cool! In that case I'll re-attempt this.

@zhu-han I tried again with your fix, but I'm still getting the following error:
```
File "./mmi_att_transformer_train.py", line 104, in get_objf
  nnet_output, encoder_memory, memory_mask = model(feature, supervision_segments)
File "/home/hltcoe/pzelasko/miniconda3/envs/k2env/lib/python3.7/site-packages/torch/nn/modules/module.py", line...
```

You were right about that issue in Lhotse - the musan mixing code sometimes truncated too much of the original utterance. I fixed it (gonna merge as soon as the...
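To illustrate the behaviour the fix restores, here is a conceptual sketch with a hypothetical helper (not the actual Lhotse code): mixing noise into an utterance should never shorten the utterance itself.

```python
import numpy as np

def mix_with_noise(speech: np.ndarray, noise: np.ndarray, snr_db: float = 10.0) -> np.ndarray:
    """Add `noise` to `speech` without shortening `speech` (hypothetical helper)."""
    # Tile or trim the noise to match the speech length, never the other way around.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]
    # Scale the noise to the requested SNR.
    speech_energy = np.mean(speech ** 2) + 1e-10
    noise_energy = np.mean(noise ** 2) + 1e-10
    gain = np.sqrt(speech_energy / (noise_energy * 10 ** (snr_db / 10)))
    mixed = speech + gain * noise
    assert len(mixed) == len(speech)  # the original utterance is never truncated
    return mixed
```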

To make things easier, I confirmed that the issue does not arise regardless of the `duration_factor` setting in the LSTM recipe (`mmi_bigram_train.py`).