unimrcp-vosk-plugin
Issue with DTMF recognition when using the UniMRCP Vosk plugin
Hi, I have been working on a FreeSWITCH / UniMRCP / Kaldi stack using the Vosk server plugin. Speech recognition works flawlessly, but I am running into an issue with DTMF recognition: the termination cause is always 002 no-input-timeout. I can reproduce the issue with FreeSWITCH or with the umc client, and both behave exactly the same. The unimrcpserver log shows that the start and end events are detected for each digit, yet the utterance files are empty. It looks as if the data is not being streamed to the MRCP server.
Here is the relevant excerpt from the unimrcpserver log:
MRCP/2.0 203 RECOGNIZE 1
Channel-Identifier: 465b528207ee48df@speechrecog
Content-Id: request1@form-level
Content-Type: text/uri-list
Cancel-If-Queue: false
Content-Length: 19

builtin:dtmf/digits
2022-02-18 03:26:44:350231 [INFO] Assign Control Channel 465b528207ee48df@speechrecog to Connection 127.0.1.1:1544 <-> 127.0.0.1:33236 [0] -> [1]
2022-02-18 03:26:44:350563 [INFO] Process RECOGNIZE Request 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:44:350856 [INFO] Open Utterance Output File [/usr/local/unimrcp/var/utter-8kHz-465b528207ee48df.pcm] for Writing
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.00898099 seconds in looped compilation.
2022-02-18 03:26:44:409612 [INFO] Process RECOGNIZE Response 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:44:409648 [INFO] State Transition IDLE -> RECOGNIZING 465b528207ee48df@speechrecog
2022-02-18 03:26:44:409722 [INFO] Send MRCPv2 Data 127.0.1.1:1544 <-> 127.0.0.1:33236 [83 bytes]
MRCP/2.0 83 1 200 IN-PROGRESS
Channel-Identifier: 465b528207ee48df@speechrecog

2022-02-18 03:26:44:498216 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:1
2022-02-18 03:26:44:541509 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:1 duration:560 ts
2022-02-18 03:26:44:602958 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:2
2022-02-18 03:26:44:660659 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:2 duration:560 ts
2022-02-18 03:26:44:724671 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:3
2022-02-18 03:26:44:782711 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:3 duration:560 ts
2022-02-18 03:26:44:843399 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:4
2022-02-18 03:26:44:901544 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:4 duration:560 ts
2022-02-18 03:26:49:444932 [INFO] Detected Noinput 465b528207ee48df@speechrecog
2022-02-18 03:26:49:445073 [INFO] Process RECOGNITION-COMPLETE Event 465b528207ee48df@speechrecog [1]
2022-02-18 03:26:49:445091 [INFO] State Transition RECOGNIZING -> RECOGNIZED 465b528207ee48df@speechrecog
2022-02-18 03:26:49:445184 [INFO] Send MRCPv2 Data 127.0.1.1:1544 <-> 127.0.0.1:33236 [138 bytes]
MRCP/2.0 138 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: 465b528207ee48df@speechrecog
Completion-Cause: 002 no-input-timeout
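For anyone who wants to cross-check the excerpt quickly, here is a rough sketch that summarises the event lines and the completion cause. The regular expressions and the `summarize` helper are my own illustration, written only against the log lines pasted above; they are not part of UniMRCP or the Vosk plugin, and other log formats would need different patterns.

```python
import re

# Event lines and completion cause copied from the log excerpt above.
LOG = """\
2022-02-18 03:26:44:498216 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:1
2022-02-18 03:26:44:541509 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:1 duration:560 ts
2022-02-18 03:26:44:602958 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:2
2022-02-18 03:26:44:660659 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:2 duration:560 ts
2022-02-18 03:26:44:724671 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:3
2022-02-18 03:26:44:782711 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:3 duration:560 ts
2022-02-18 03:26:44:843399 [INFO] Detected Start of Event 465b528207ee48df@speechrecog id:4
2022-02-18 03:26:44:901544 [INFO] Detected End of Event 465b528207ee48df@speechrecog id:4 duration:560 ts
2022-02-18 03:26:49:444932 [INFO] Detected Noinput 465b528207ee48df@speechrecog
Completion-Cause: 002 no-input-timeout
"""

# Hypothetical patterns, matched only against the line shapes seen above.
EVENT_RE = re.compile(r"Detected (Start|End) of Event (\S+) id:(\d+)")
CAUSE_RE = re.compile(r"Completion-Cause:\s*(\d{3})\s+(\S+)")


def summarize(log_text):
    """Collect the ids of started/ended events and the completion cause."""
    started, ended = [], []
    for kind, _channel, event_id in EVENT_RE.findall(log_text):
        (started if kind == "Start" else ended).append(int(event_id))
    cause = CAUSE_RE.search(log_text)
    return {
        "started": started,
        "ended": ended,
        "noinput_detected": "Detected Noinput" in log_text,
        "completion_cause": cause.groups() if cause else None,
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

On this excerpt it reports four started and four ended events followed by a no-input detection, which matches what I described: the events reach the server, but the request still completes with 002.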
I have tried matching the codecs between the client and the server, but that didn't make any difference. Let me know if you need any other debug information.