Haowen Qiu
> A linear FSA is always arc sorted, so I think it is not necessary to sort it before calling `k2.intersect`. That's related to a bug we had before; it's generally...
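For context, a minimal sketch of what this looks like in the Python API (the labels and the second FSA below are made up purely for illustration):

```python
import k2

# An arbitrary FSA in k2's text format (made-up topology and labels);
# this one does need to be arc sorted before intersection.
s = '''
0 1 2 0.1
0 1 5 0.2
1 2 8 0.3
2 3 -1 0.4
3
'''
other = k2.arc_sort(k2.Fsa.from_str(s))

# A linear FSA has a single arc leaving each state, so it is
# arc sorted by construction; no k2.arc_sort call is needed.
linear = k2.linear_fsa([2, 8])

result = k2.intersect(linear, other)
print(result.num_arcs)
```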
It's strange that `Fsa._grad_cache` is not kept in the leaked memory?

```
(Pdb) leftover[2].byrcs[6].referrers.byrcs[0].referents
Partition of a set of 4 objects. Total size = 900 bytes.
Index Count % Size...
```
Sure, definitely we need to do this; we also plan to cache those static decoding graphs at the beginning of training.
We add self-loops with symbol zero (epsilon) here so they can match `blank` in CTC training. You may want to check the `intersect_dense_pruned` code below; we do CTC alignment with...
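For reference, a minimal sketch of that step (the token ids are made up; the real graph construction has more to it):

```python
import k2

# A linear FSA over made-up token ids for a transcript.
transcript = k2.linear_fsa([3, 7, 9])

# Add self-loops labeled 0 (epsilon) at each state so the graph
# can absorb blank frames during CTC training/alignment.
ctc_graph = k2.add_epsilon_self_loops(transcript)
```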
Seems `to_dense` will generate a copy? Not sure whether it's feasible to share memory between the sparse tensor and the ragged tensor and do `abs` on the ragged tensor.
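A quick check (plain PyTorch, not k2-specific) that `to_dense` materializes a copy rather than a view:

```python
import torch

indices = torch.tensor([[0, 1], [1, 0]])
values = torch.tensor([3.0, 4.0])
sp = torch.sparse_coo_tensor(indices, values, (2, 2))

dense = sp.to_dense()
dense[0, 1] = 100.0  # mutate the dense result only

# The sparse tensor's values are untouched, so to_dense() copied.
print(sp.coalesce().values())  # tensor([3., 4.])
```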
@Curisan just adding some notes in case you are eager to learn about this before fangjun's documentation. When we train or decode, we usually feed data into the nnet model in batches...
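To make the batching concrete, here is a small sketch of how batched nnet output is typically wrapped for k2 (all shapes and values below are made up):

```python
import torch
import k2

N, T, C = 2, 10, 5  # batch size, frames, classes (made up)
log_probs = torch.randn(N, T, C).log_softmax(dim=-1)

# One row per supervision: [sequence_index, start_frame, num_frames];
# dtype must be torch.int32.
supervision_segments = torch.tensor([[0, 0, 10],
                                     [1, 0, 8]], dtype=torch.int32)

dense_fsa_vec = k2.DenseFsaVec(log_probs, supervision_segments)
```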
RE 1 and 3:
- We use `L.fst` for training; there are no disambig symbols in it.
- We add self-loops (blank) in training and decoding.
As G contains disambiguation symbols but L does not, we should use `L_disambig.fst.txt` in `data/lang_nosp` for L. We also need to determinize the intersection result on phones and then remove...
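A rough sketch of that pipeline with the k2 Python API (the G path is hypothetical, and the real scripts handle epsilons and disambig-symbol removal with more care):

```python
import k2

with open('data/lang_nosp/L_disambig.fst.txt') as f:
    L = k2.Fsa.from_openfst(f.read(), acceptor=False)
with open('data/lang_nosp/G.fst.txt') as f:  # hypothetical path
    G = k2.Fsa.from_openfst(f.read(), acceptor=False)

# compose matches L's aux_labels (words) with G's labels.
LG = k2.compose(k2.arc_sort(L), k2.arc_sort(G))

# Determinize on the phone labels, then trim.
LG = k2.determinize(k2.arc_sort(LG))
LG = k2.connect(LG)
```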
You can check the end of `words.txt`; it should contain `#0` etc.
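For example, a quick way to peek at the tail of the file (path assumed from the thread):

```python
# Print the last few entries of words.txt; disambig symbols
# like "#0" should appear near the end.
with open('data/lang_nosp/words.txt') as f:
    for line in f.readlines()[-5:]:
        print(line.rstrip())
```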
Yeah, I have removed `` in G and I can now get the determinized result of L*G, but the problem I have now is that I need to get `aux_labels` (i.e. words)...