sherpa-onnx
Blank results from online-websocket-client-microphone.py
I tried to use python-api-examples/online-websocket-client-microphone.py after starting the sherpa-onnx-online-websocket-server, but I got blank results:
Started! Please speak
{"is_final":false, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":2.56, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":2.56, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":5.12, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":5.12, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":7.68, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":7.68, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":10.24, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":10.24, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":12.80, "text": "", "timestamps": [], "tokens":[]}
What did you say after starting python-api-examples/online-websocket-client-microphone.py, and what result did you expect?
Sorry if my explanation was not very clear. What I meant is: when I launched the sherpa-onnx-online-websocket-server, I observed that the service was up and running in the background. Then I started the client with python-api-examples/online-websocket-client-microphone.py, because I wanted to use my computer's microphone for real-time speech recognition. However, when I spoke, the client returned empty results.
> when I launched the sherpa-onnx-online-websocket-server

Which model are you using?

> However, when I spoke

Did you speak English, and is the server using an English model?
I used the model named sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12, and I spoke Chinese.
Similarly, I also tried the same functionality in Python, following the guide at https://k2-fsa.github.io/sherpa/onnx/websocket/online-websocket.html#start-the-client-python-with-microphone.
It appeared to have started successfully, but when I spoke, there was no output of any kind.
Started! Please speak
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/features.cc:AcceptWaveformImpl:89 Creating a resampler:
in_sample_rate: 48000
output_sample_rate: 16000
Please post the complete command about how you start the server.
Also, please test it with https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/speech-recognition-from-microphone-with-endpoint-detection.py, which does not use a server or a client. That makes debugging easier.
> Please post the complete command about how you start the server.

I see. This is the command I use to start the server:
(base) MacBook-Pro sherpa-onnx % python python-api-examples/speech-recognition-from-microphone.py \
--tokens=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt \
--encoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/encoder-epoch-20-avg-1-chunk-16-left-128.onnx \
--decoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/decoder-epoch-20-avg-1-chunk-16-left-128.onnx \
--joiner=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/joiner-epoch-20-avg-1-chunk-16-left-128.onnx
0 DELL U2422HX, Core Audio (0 in, 2 out)
1 iPhone Microphone, Core Audio (1 in, 0 out)
> 2 MacBook Pro Microphone, Core Audio (1 in, 0 out)
< 3 MacBook Pro Speakers, Core Audio (0 in, 2 out)
4 Microsoft Teams Audio, Core Audio (2 in, 2 out)
Use default device: MacBook Pro Microphone
Started! Please speak
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/features.cc:AcceptWaveformImpl:89 Creating a resampler:
in_sample_rate: 48000
output_sample_rate: 16000
> Also, please test it with https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/speech-recognition-from-microphone-with-endpoint-detection.py, which does not use a server or a client. That makes debugging easier.

Okay. I will give it a try.
The result seems the same as with speech-recognition-from-microphone.py. I think the problem might be the microphone or the streaming input. I will check it.
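A quick way to check whether the microphone is actually delivering audio is to look at the RMS level of a captured buffer. This is only a sketch: the `rms`/`looks_silent` helpers and the `1e-4` silence threshold below are illustrative and not part of sherpa-onnx; the real-microphone part uses the `sounddevice` package that the examples already depend on.

```python
import numpy as np

def rms(samples: np.ndarray) -> float:
    """Root-mean-square level of a float audio buffer."""
    return float(np.sqrt(np.mean(np.square(samples, dtype=np.float64))))

def looks_silent(samples: np.ndarray, threshold: float = 1e-4) -> bool:
    """True if the buffer is effectively silence (illustrative threshold)."""
    return rms(samples) < threshold

# Synthetic sanity check: a 440 Hz tone is clearly not silent, zeros are.
t = np.linspace(0, 1, 16000, endpoint=False)
tone = 0.1 * np.sin(2 * np.pi * 440 * t).astype(np.float32)
print(looks_silent(tone))             # False
print(looks_silent(np.zeros(16000)))  # True

# To test the real microphone (uncomment; requires `sounddevice`):
#   import sounddevice as sd
#   buf = sd.rec(2 * 48000, samplerate=48000, channels=1, dtype="float32")
#   sd.wait()
#   print("mic RMS:", rms(buf[:, 0]))
```

If the microphone RMS stays near zero while you speak, the capture path (device selection, input level, or the macOS microphone permission for your terminal) is the likely culprit rather than the recognizer.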
This command works perfectly on my side. Please check your microphone.
By the way, you can use it to decode files. If it works, then there must be issues with your microphone.
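For the file-decoding check, a sketch of an invocation using the repo's python-api-examples/online-decode-files.py with the same model is below. The placeholder wave path is an assumption; substitute any 16 kHz mono WAV, for example one from the test_wavs directory bundled with the model archive.

```shell
python python-api-examples/online-decode-files.py \
  --tokens=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt \
  --encoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/encoder-epoch-20-avg-1-chunk-16-left-128.onnx \
  --decoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/decoder-epoch-20-avg-1-chunk-16-left-128.onnx \
  --joiner=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/joiner-epoch-20-avg-1-chunk-16-left-128.onnx \
  <path-to-a-16kHz-wav>
```

If this prints the expected transcript but the microphone runs stay blank, the model and recognizer are fine and the problem is in audio capture.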
Yeah I think so. I will check the microphone to see what happened.