vosk-api

Runtime: Rebuilding repository / Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes

Open andrestone opened this issue 1 year ago • 3 comments

Edit: just saw there was a new release with "Fixes for lattice construction". Will try that and report back.

Hello there!

I'm running parallel transcription jobs using one recognizer instance per job. Roughly 3% of the jobs return bad transcriptions (usually "the the the" or some variation of it). When that happens, I get messages like this:

LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39004704,169056,10889208), after rebuilding, repo size was 30202176, effective beam was 5.9772 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39078112,166560,10761912), after rebuilding, repo size was 30382048, effective beam was 5.96487 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39287392,165152,10632816), after rebuilding, repo size was 30607040, effective beam was 5.96444 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39300192,162400,10546344), after rebuilding, repo size was 30658624, effective beam was 5.92997 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39311520,160800,10561272), after rebuilding, repo size was 30832672, effective beam was 5.87253 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39380960,162432,10497984), after rebuilding, repo size was 30972608, effective beam was 5.84342 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39416736,157696,10433400), after rebuilding, repo size was 31099904, effective beam was 5.82206 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39511712,157504,10508424), after rebuilding, repo size was 31215648, effective beam was 5.80311 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39478528,155264,10420152), after rebuilding, repo size was 31215904, effective beam was 5.77253 vs. requested beam 6
LOG (VoskAPI:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
WARNING (VoskAPI:CheckMemoryUsage():determinize-lattice-pruned.cc:316) Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (39833312,153440,10056048), after rebuilding, repo size was 31604736, effective beam was 5.80277 vs. requested beam 6
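For anyone who wants to flag affected jobs from the log output, here is a minimal sketch that pulls the numbers out of warnings like the ones above. It assumes the process's log output has already been captured as text; the regular expression is modeled only on the lines quoted here, so the wording may differ in other versions, and `parse_beam_warnings` is a hypothetical helper, not part of the Vosk API.

```python
import re

# Pattern modeled on the WARNING lines quoted in this issue. The exact
# wording is an assumption and may differ between versions.
BEAM_WARNING = re.compile(
    r"Did not reach requested beam in determinize-lattice: "
    r"size exceeds maximum (?P<max_bytes>\d+) bytes; "
    r"\(repo,arcs,elems\) = \((?P<repo>\d+),(?P<arcs>\d+),(?P<elems>\d+)\), "
    r"after rebuilding, repo size was (?P<rebuilt>\d+), "
    r"effective beam was (?P<effective>[\d.]+) "
    r"vs\. requested beam (?P<requested>[\d.]+)"
)


def parse_beam_warnings(log_text):
    """Return one dict per beam warning found in the captured log text."""
    warnings = []
    for match in BEAM_WARNING.finditer(log_text):
        warnings.append({
            "max_bytes": int(match["max_bytes"]),
            "repo": int(match["repo"]),
            "arcs": int(match["arcs"]),
            "elems": int(match["elems"]),
            "rebuilt_repo": int(match["rebuilt"]),
            "effective_beam": float(match["effective"]),
            "requested_beam": float(match["requested"]),
        })
    return warnings
```

A job whose captured log yields a non-empty list could then be marked for a retry. Whether these warnings always coincide with a bad transcription is not established in this thread.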

Is this a known bug / limitation, or am I doing something wrong?

I was wondering if there's a way I can check for this behavior before or after calling vosk_recognizer_final_result so I can retry the transcription job.
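One possible after-the-fact check, sketched under assumptions: the final result is taken to be a JSON string with a `text` field, and a transcript counts as suspicious when a single word dominates it, as in the "the the the" case. The helper name `looks_degenerate` and its thresholds are hypothetical and would need tuning on real data.

```python
import json
from collections import Counter


def looks_degenerate(result_json, min_words=3, max_share=0.8):
    """Heuristic: True if one word makes up most of the transcript.

    result_json is assumed to be a JSON string with a "text" field.
    min_words and max_share are illustrative thresholds, not tuned values.
    """
    text = json.loads(result_json).get("text", "")
    words = text.split()
    if len(words) < min_words:
        # Too short to judge; do not flag it.
        return False
    _, top_count = Counter(words).most_common(1)[0]
    return top_count / len(words) >= max_share
```

A job runner could call this on the final result string and re-queue the job when it returns `True`. It only catches the repeated-word symptom, so other kinds of bad output would pass unnoticed.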

Thanks!

andrestone, Aug 31 '22 07:08

It would help if you provided the particular audio chunk for analysis.

The issue here may be caused by music in the audio or some other noise. The latest changes should not help with that; it is a more generic problem.

nshmyrev, Aug 31 '22 08:08

Hey @nshmyrev, thanks for replying.

Unfortunately, it's not reproducible using the same audio chunk. After upgrading to the latest binary, the issue seems to be gone, but I haven't tested it exhaustively.

I'll get back here with the final outcome.

andrestone, Aug 31 '22 10:08

The upgrade indeed didn't solve the problem. I can see that some memory is being leaked; I will investigate where this is coming from.

When the system is misbehaving, I also get this log message, in addition to the ones in the original post:

WARNING (VoskAPI:LinearCgd():optimization.cc:549) Doing linear CGD in dimension 100, after 15 iterations the squared residual has got worse, 3.76336 > 3.19428.  Will do an exact optimization.

andrestone, Aug 31 '22 12:08