icefall
Non-streaming Conformer model with pruned_rnnt_loss always emits the first non-blank characters on the very first frames.
I trained two offline reworked Conformer models on my own Chinese data, one with pruned_rnnt_loss and one with the standard RNN-T loss (warp-rnnt==0.7.0), both following pruned_transducer_stateless5. With Conformer + pruned_rnnt_loss, the timestamp of the first word is still always zero, whereas Conformer + standard RNN-T loss does not show this behavior.
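To make the symptom concrete, here is a minimal sketch of how it can be measured over a test set. It assumes decoding returns, for each utterance, the encoder output frame index at which each non-blank token was emitted. The function names, the subsampling factor of 4, and the 10 ms frame shift are assumptions for illustration, not values confirmed for my setup or taken from the icefall API.

```python
from typing import List


def token_times(
    frame_indexes: List[int],
    subsampling_factor: int = 4,   # assumed encoder subsampling
    frame_shift_s: float = 0.01,   # assumed 10 ms feature frame shift
) -> List[float]:
    """Convert encoder frame indexes to token start times in seconds."""
    return [round(i * subsampling_factor * frame_shift_s, 3) for i in frame_indexes]


def first_token_at_zero_rate(all_frame_indexes: List[List[int]]) -> float:
    """Fraction of non-empty utterances whose first token is emitted on frame 0."""
    non_empty = [f for f in all_frame_indexes if f]
    if not non_empty:
        return 0.0
    hits = sum(1 for f in non_empty if f[0] == 0)
    return hits / len(non_empty)
```

A rate close to 1.0 for the pruned_rnnt_loss model and close to 0.0 for the warp-rnnt model on the same utterances would match what I am describing.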
I have seen the comments in: https://github.com/k2-fsa/icefall/issues/1347 https://github.com/k2-fsa/icefall/pull/942 https://github.com/k2-fsa/icefall/issues/923 https://github.com/k2-fsa/sherpa/pull/52
But I still don't know how to avoid this problem. Is there any way to fix it? Thanks!