
Issue with DTMF recognition when using the unimrcp vosk plugin

Open vkolluru76 opened this issue 3 years ago • 6 comments

Hi, I have been working on a FreeSWITCH / UniMRCP / Kaldi stack using the Vosk server plugin. Speech recognition is working flawlessly, but I am running into an issue with DTMF recognition: I always get 002 no-input-timeout as the completion cause. I can replicate the issue using either FreeSWITCH or the umc client, and both behave exactly the same. The unimrcpserver log shows the start and end events being detected for each digit, but the utterance files are empty. It appears as if the data is not being streamed to the MRCP server.

Here is the relevant excerpt from the unimrcpserver log:

MRCP/2.0 203 RECOGNIZE 1
Channel-Identifier: 465b528207ee48df@speechrecog
Content-Id: request1@form-level
Content-Type: text/uri-list
Cancel-If-Queue: false
Content-Length: 19

builtin:dtmf/digits

2022-02-18 03:26:44:350231 [INFO] Assign Control Channel 465b528207ee48df@speechrecog to Connection 127.0.1.1:1544 <-> 127.0.0.1:33236 [0] -> [1]
2022-02-18 03:26:44:350563 [INFO] Process RECOGNIZE Request 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:44:350856 [INFO] Open Utterance Output File [/usr/local/unimrcp/var/utter-8kHz-465b528207ee48df.pcm] for Writing
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.00898099 seconds in looped compilation.
2022-02-18 03:26:44:409612 [INFO] Process RECOGNIZE Response 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:44:409648 [INFO] State Transition IDLE -> RECOGNIZING 465b528207ee48df@speechrecog
2022-02-18 03:26:44:409722 [INFO] Send MRCPv2 Data 127.0.1.1:1544 <-> 127.0.0.1:33236 [83 bytes]

MRCP/2.0 83 1 200 IN-PROGRESS
Channel-Identifier: 465b528207ee48df@speechrecog

2022-02-18 03:26:44:498216 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:1
2022-02-18 03:26:44:541509 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:1 duration:560 ts
2022-02-18 03:26:44:602958 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:2
2022-02-18 03:26:44:660659 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:2 duration:560 ts
2022-02-18 03:26:44:724671 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:3
2022-02-18 03:26:44:782711 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:3 duration:560 ts
2022-02-18 03:26:44:843399 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:4
2022-02-18 03:26:44:901544 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:4 duration:560 ts
2022-02-18 03:26:49:444932 [INFO] Detected Noinput 465b528207ee48df@speechrecog
2022-02-18 03:26:49:445073 [INFO] Process RECOGNITION-COMPLETE Event 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:49:445091 [INFO] State Transition RECOGNIZING -> RECOGNIZED 465b528207ee48df@speechrecog
2022-02-18 03:26:49:445184 [INFO] Send MRCPv2 Data 127.0.1.1:1544 <-> 127.0.0.1:33236 [138 bytes]

MRCP/2.0 138 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: 465b528207ee48df@speechrecog
Completion-Cause: 002 no-input-timeout
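For anyone comparing their own logs against this one, the pattern (event start/end pairs are logged, yet the request still completes with no-input-timeout) can be checked with a small script. This is only a minimal sketch: the regular expressions assume the exact unimrcpserver log wording shown in this excerpt, and `summarize` is a hypothetical helper name, not part of UniMRCP or the Vosk plugin.

```python
import re

# Patterns assume the log line wording shown in this issue's excerpt.
START_RE = re.compile(r"Detected Start of Event \S+ id:(\d+)")
END_RE = re.compile(r"Detected End of Event \S+ id:(\d+)")
CAUSE_RE = re.compile(r"Completion-Cause:\s*(\d{3})\s+(\S+)")


def summarize(log_text):
    """Count event start/end lines and extract the completion cause.

    Hypothetical helper for triage; it only reads the log text.
    """
    cause = CAUSE_RE.search(log_text)
    return {
        "event_starts": len(START_RE.findall(log_text)),
        "event_ends": len(END_RE.findall(log_text)),
        "noinput_detected": "Detected Noinput" in log_text,
        # (code, name) tuple, or None if no Completion-Cause header is present
        "completion_cause": cause.groups() if cause else None,
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py /path/to/unimrcpserver.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        print(summarize(fh.read()))
```

If the counts of start and end lines match the number of digits sent while `completion_cause` still reports `002`, the log shows the same symptom described here.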

I have tried matching the codecs between the client and the server, but that didn't make any difference. Let me know if you need any other debug information.

vkolluru76 • Feb 18 '22 03:02