
Word-alignment problem

Open • arielvsp opened this issue 7 years ago • 1 comment

Hi,

Running the demo script (./transcribe-audio.sh dr_strangelove.mp3) produces the following output and hangs:

LOG ([5.2.64~1-2fbf2]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG ([5.2.64~1-2fbf2]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
WARNING ([5.2.64~1-2fbf2]:LatticeWordAligner():word-align-lattice.cc:263) [Lattice has input epsilons and/or is not input-deterministic (in Mohri sense)]-- i.e. lattice is not deterministic.  Word-alignment may be slow and-or blow up in memory.
WARNING ([5.2.64~1-2fbf2]:LatticeWordAligner():word-align-lattice.cc:263) [Lattice has input epsilons and/or is not input-deterministic (in Mohri sense)]-- i.e. lattice is not deterministic.  Word-alignment may be slow and-or blow up in memory.
WARNING ([5.2.64~1-2fbf2]:LatticeWordAligner():word-align-lattice.cc:263) [Lattice has input epsilons and/or is not input-deterministic (in Mohri sense)]-- i.e. lattice is not deterministic.  Word-alignment may be slow and-or blow up in memory.
huh i hello this is hello dimitri listen i i can't hear too well do you support you could turn the music down just a little
Caught SIGSEGV

Is this expected, given the Kaldi warnings? I see the same behavior with the streaming service (the worker hangs) when do-phone-alignment is set to "true". Is there anything I can change in Kaldi to prevent or mitigate this?

arielvsp • Aug 26 '17 20:08

I investigated this problem. You can't call WordAlignLattice twice on the same lattice: the first run replaces silences with epsilons, so the second run emits the warning. A fix for a similar problem is here:

https://github.com/alphacep/vosk-api/commit/558b4dd69e75e7f5d0644c5221302b6035cbfe99
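To illustrate the idea of aligning only once, here is a minimal toy sketch in Python. It is not the Kaldi API and not the code from the linked commit: `toy_word_align`, `AlignOnce`, and the `SIL` / `<eps>` labels are hypothetical stand-ins, and the "lattice" is just a list of labels. It only models the behavior described above (the first pass turns silences into epsilons, so a second pass over its output warns) and shows one way to avoid the second pass by caching the aligned result.

```python
import warnings

EPS = "<eps>"  # hypothetical epsilon label for this toy model
SIL = "SIL"    # hypothetical silence label for this toy model


def toy_word_align(labels):
    """Stand-in for a word-alignment pass (NOT the Kaldi function).

    Warns if the input already contains epsilons, then replaces
    silences with epsilons, mirroring the behavior described above.
    """
    if EPS in labels:
        warnings.warn("lattice has input epsilons; word-alignment may be slow")
    return [EPS if label == SIL else label for label in labels]


class AlignOnce:
    """Holds a raw toy lattice and aligns it at most once."""

    def __init__(self, labels):
        self._raw = list(labels)
        self._aligned = None

    def aligned(self):
        # Run the alignment on first use only; later calls reuse the result.
        if self._aligned is None:
            self._aligned = toy_word_align(self._raw)
        return self._aligned


def count_warnings(fn):
    """Call fn() and return how many warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        fn()
    return len(caught)
```

In this toy model, feeding the output of `toy_word_align` back into itself triggers the warning, while calling `AlignOnce.aligned()` repeatedly does not, because the alignment runs once and the result is reused. Whether the plugin can be restructured this way depends on where it calls the aligner, which I have not checked.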

nshmyrev • Jun 24 '21 18:06