vosk-api
Runtime: Rebuilding repository / Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes
Edit: just saw there was a new release with "Fixes for lattice construction". Will try that and report back.
Hello there!
I'm running parallel transcription jobs, using one recognizer instance per job. Roughly 3% of the jobs return bad transcriptions (usually "the the the" or some variation of it). When this happens, I get messages like this:
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39004704,169056,10889208), after rebuilding, repo size was 30202176, effective beam was 5.9772 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39078112,166560,10761912), after rebuilding, repo size was 30382048, effective beam was 5.96487 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39287392,165152,10632816), after rebuilding, repo size was 30607040, effective beam was 5.96444 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39300192,162400,10546344), after rebuilding, repo size was 30658624, effective beam was 5.92997 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39311520,160800,10561272), after rebuilding, repo size was 30832672, effective beam was 5.87253 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39380960,162432,10497984), after rebuilding, repo size was 30972608, effective beam was 5.84342 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39416736,157696,10433400), after rebuilding, repo size was 31099904, effective beam was 5.82206 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39511712,157504,10508424), after rebuilding, repo size was 31215648, effective beam was 5.80311 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39478528,155264,10420152), after rebuilding, repo size was 31215904, effective beam was 5.77253 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39833312,153440,10056048), after rebuilding, repo size was 31604736, effective beam was 5.80277 vs. requested beam 6
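To put a number on how far each of these warnings falls short of the requested beam, the log text can be parsed. This is only a rough sketch: it assumes the log output has already been captured as a string (for example from the process's stderr), and `beam_shortfalls` is a hypothetical helper name, not part of Vosk or Kaldi.

```python
import re

# Matches the tail of the CheckMemoryUsage() warning shown above.
BEAM_RE = re.compile(
    r"effective beam was (?P<effective>[0-9.]+) "
    r"vs\. requested beam (?P<requested>[0-9.]+)"
)


def beam_shortfalls(log_text):
    """Return requested-minus-effective beam for each matching warning line."""
    shortfalls = []
    for line in log_text.splitlines():
        match = BEAM_RE.search(line)
        if match:
            requested = float(match["requested"])
            effective = float(match["effective"])
            shortfalls.append(requested - effective)
    return shortfalls
```

A non-empty list for a given job means determinization hit the memory limit at least once, which could be used as a signal to flag that job.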
Is this a known bug / limitation, or am I doing something wrong?
I was wondering if there's a way to check for this behavior before or after calling vosk_recognizer_final_result, so I can retry the transcription job.
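One possible check after the fact, as a minimal sketch: look at the final result and see whether a single word dominates it. This assumes the final result is a JSON string with a "text" field; `looks_degenerate` and its thresholds are hypothetical and would need tuning, since legitimate short utterances can also repeat a word.

```python
import json
from collections import Counter


def looks_degenerate(result_json, min_words=3, max_top_share=0.6):
    """Return True if one word dominates the transcript (e.g. "the the the").

    result_json: JSON string assumed to contain a "text" field.
    min_words and max_top_share are illustrative thresholds, not Vosk settings.
    """
    text = json.loads(result_json).get("text", "")
    words = text.split()
    if len(words) < min_words:
        # Too short to judge; do not flag it.
        return False
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) >= max_top_share
```

A job whose result is flagged this way could then be queued for a retry with a fresh recognizer instance.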
Thanks!
It would help if you could provide the particular audio chunk for analysis.
The issue here may be caused by music in the audio or some other noise. The latest changes should not help with that; it is a more general problem.
Hey @nshmyrev, thanks for replying.
Unfortunately, it's not reproducible using the same audio chunk. After upgrading to the latest binary, the issue seems to be gone, but I haven't tested it exhaustively.
I'll get back here with the final outcome.
It didn't solve the problem after all. I can see that some memory is being leaked; I will investigate where this is coming from.
When the system is misbehaving, I also get this log message (in addition to the ones in the original post):
WARNING (VoskAPI:LinearCgd():optimization.cc:549) Doing linear CGD in dimension 100, after 15 iterations the squared residual has got worse, 3.76336 > 3.19428. Will do an exact optimization.