Non-streaming Conformer model with pruned_rnnt_loss always emits the first non-blank characters on the very first frames.

Open · guoyifan97 opened this issue 1 year ago · 1 comment

I trained two offline (non-streaming) reworked Conformer models on my own Chinese data, following the pruned_transducer_stateless5 recipe: one with pruned_rnnt_loss and one with a standard RNN-T loss (warp-rnnt==0.7.0). With the Conformer + pruned_rnnt_loss model, I still see the issue where the timestamp of the first word is always zero. The Conformer + standard RNN-T loss model does not show this behavior.
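To put a number on the symptom, here is a minimal sketch that measures how often the first non-blank token lands on frame 0. It assumes the decoder's per-token frame indices have already been collected into a plain dict; the function name `first_token_frame_stats`, the utterance IDs, and the sample values are hypothetical, and the defaults (subsampling factor 4, 10 ms frame shift) are assumptions that should be adjusted to the actual model.

```python
from typing import Dict, List


def first_token_frame_stats(
    timestamps: Dict[str, List[int]],
    subsampling_factor: int = 4,   # assumption: encoder subsamples by 4
    frame_shift_s: float = 0.01,   # assumption: 10 ms feature frame shift
) -> Dict[str, float]:
    """Summarize where the first non-blank token is emitted.

    `timestamps` maps an utterance id to the encoder-output frame indices
    at which non-blank tokens were emitted (hypothetical input format).
    """
    # Ignore utterances for which nothing was decoded.
    first_frames = [frames[0] for frames in timestamps.values() if frames]
    if not first_frames:
        return {"num_utts": 0, "frac_at_zero": 0.0, "mean_first_time_s": 0.0}

    num_utts = len(first_frames)
    num_at_zero = sum(1 for f in first_frames if f == 0)
    mean_first_frame = sum(first_frames) / num_utts

    return {
        "num_utts": num_utts,
        "frac_at_zero": num_at_zero / num_utts,
        # Convert encoder frames back to seconds of audio.
        "mean_first_time_s": mean_first_frame * subsampling_factor * frame_shift_s,
    }


if __name__ == "__main__":
    # Hypothetical frame indices, for illustration only.
    example = {"utt1": [0, 12, 20], "utt2": [0, 7], "utt3": [5, 9, 14]}
    print(first_token_frame_stats(example))
```

Running this on the outputs of both models, over the same test set, would show whether the pruned_rnnt_loss model really emits at frame 0 for every utterance or only for most of them.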

I have read the comments in: https://github.com/k2-fsa/icefall/issues/1347 https://github.com/k2-fsa/icefall/pull/942 https://github.com/k2-fsa/icefall/issues/923 https://github.com/k2-fsa/sherpa/pull/52

However, I still don't know how to avoid this problem. Is there a way to fix it? Thanks!

guoyifan97 · Jun 24 '24 07:06