Piotr Żelasko comments

Results: 523 comments of Piotr Żelasko

Use MMI not CTC model for alignment

Would it make sense to use a pure TDNN/TDNNF/CNN model for alignments? I was investigating alignments from the conformer recently and my feeling was that they weren't perfect (even though...

Use MMI not CTC model for alignment

I'll submit a PR with the code that allows computing alignments and visualizing them later. As to data augmentation of alignments, we could extend most transforms to handle it --...

Use MMI not CTC model for alignment

Regarding this: it's actually weird that CTC and MMI alimdl would not make a difference. Some time ago, I think I looked at both CTC and MMI posteriors, and they...

Naming problem RE sequence_idx

FWIW I believe we have been using the "cut concatenation" mechanism, which packs multiple cuts+supervisions into a single "sequence", for several months now. If there was an issue, I think all...

Label smoothing for LF-MMI

Since we have an ali model, maybe another option is to add a frame-wise cross-entropy loss using that alignment, and apply the label smoothing there?
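To make the suggestion concrete, here is a minimal pure-Python sketch of a label-smoothed frame-wise cross-entropy computed against a frame-level alignment. The function names, the list-based inputs, and the default smoothing value are all assumptions made for illustration; nothing here is taken from the snowfall code.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one frame's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def smoothed_frame_ce(
    frame_logits: Sequence[Sequence[float]],
    alignment: Sequence[int],
    smoothing: float = 0.1,  # assumed value, not from the thread
) -> float:
    """Mean label-smoothed cross-entropy over frames (hypothetical helper).

    frame_logits[t] holds the network outputs for frame t and
    alignment[t] is the class index the alignment model assigned to it.
    """
    if len(frame_logits) != len(alignment):
        raise ValueError("need exactly one alignment label per frame")
    total = 0.0
    for logits, target in zip(frame_logits, alignment):
        logp = log_softmax(logits)
        nll = -logp[target]              # loss against the aligned label
        uniform = -sum(logp) / len(logp)  # loss against a uniform target
        total += (1.0 - smoothing) * nll + smoothing * uniform
    return total / len(alignment)
```

With `smoothing=0.0` this reduces to the ordinary frame-wise cross-entropy. In a PyTorch training script the same quantity is available through the `label_smoothing` argument of `torch.nn.functional.cross_entropy` in recent versions. How such a term would be weighted against the LF-MMI objective is left open here.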

Inconsistency between the reported loss and the actual loss used for gradient computation

Could also be surprising if we want to try out a different optimizer that doesn't have Adam-like gradient scaling.

DDP address in use

We might need to make the port configurable. For a quick work-around you can change it here: https://github.com/k2-fsa/snowfall/blob/master/snowfall/dist.py#L8

DDP address in use

We can choose it randomly - although I think with `torch.distributed.launch` we'd have to choose it *outside* of the Python script, and with `torch.multiprocessing.spawn` we can choose it inside the...
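One way to pick the port inside the script is to ask the OS for an unused one in the parent process, before the workers are spawned, and export it through the `MASTER_ADDR` / `MASTER_PORT` environment variables that `torch.distributed` reads with its `env://` initialization. The sketch below uses only the standard library. Both helper names are hypothetical, and it assumes single-node training on localhost.

```python
import os
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def set_ddp_rendezvous_env(host: str = "127.0.0.1") -> int:
    """Export the rendezvous address so every spawned worker inherits it."""
    port = find_free_port(host)
    os.environ["MASTER_ADDR"] = host
    os.environ["MASTER_PORT"] = str(port)
    return port
```

This has to run once in the parent so that all ranks agree on the same port. Another process could in principle take the port between the probe and the actual rendezvous, so this reduces "address in use" collisions without fully ruling them out.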

DDP address in use

Hmm, I've never seen this one before...

DDP address in use

Oooh, now it all finally makes sense. Thanks for debugging this, guys. I'll add a fix to the cut ids partitioning in the sampler.