
Blank results from online-websocket-client-microphone.py

Open OswaldoBornemann opened this issue 1 year ago • 12 comments

I tried to run python-api-examples/online-websocket-client-microphone.py with sherpa-onnx-online-websocket-server already running, but I got blank results.

Started! Please speak
{"is_final":false, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":2.56, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":2.56, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":5.12, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":5.12, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":7.68, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":7.68, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":10.24, "text": "", "timestamps": [], "tokens":[]}
{"is_final":true, "segment":0, "start_time":10.24, "text": "", "timestamps": [], "tokens":[]}
{"is_final":false, "segment":0, "start_time":12.80, "text": "", "timestamps": [], "tokens":[]}
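A minimal sketch, not part of the original report, of how output like the above could be checked programmatically. It assumes each message arrives as one JSON object per line with the `is_final` and `text` fields shown in the log; the helper name `summarize_results` is hypothetical.

```python
import json


def summarize_results(lines):
    """Count final segments and collect any non-empty transcripts."""
    finals = 0
    texts = []
    for line in lines:
        line = line.strip()
        # Skip banner lines such as "Started! Please speak".
        if not line.startswith("{"):
            continue
        msg = json.loads(line)
        if msg.get("is_final"):
            finals += 1
        if msg.get("text"):
            texts.append(msg["text"])
    return finals, texts


# Two messages copied from the log above.
log = [
    "Started! Please speak",
    '{"is_final":false, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}',
    '{"is_final":true, "segment":0, "start_time":0.00, "text": "", "timestamps": [], "tokens":[]}',
]
finals, texts = summarize_results(log)
print(f"final segments: {finals}, non-empty transcripts: {len(texts)}")
```

An empty transcript list alongside a non-zero count of final segments matches the symptom described here: the connection works and messages arrive, but nothing is recognized.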

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

What did you say after starting python-api-examples/online-websocket-client-microphone.py and what do you expect from the returned result?

csukuangfj avatar Dec 15 '23 03:12 csukuangfj

Sorry if my explanation was not very clear. What I meant is, when I launched the sherpa-onnx-online-websocket-server, I observed that this service was already up and running in the background. Then, I tried to initiate the client service using python-api-examples/online-websocket-client-microphone.py because I wanted to utilize my computer's recording capabilities for real-time speech recognition. However, when I spoke, I noticed that the client returned empty results.

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

> when I launched the sherpa-onnx-online-websocket-server

Which model are you using?

> However, when I spoke

Did you speak English, and is the server using an English model?

csukuangfj avatar Dec 15 '23 03:12 csukuangfj

I used the model named sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12, and I spoke Chinese.

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

Similarly, I also tried the same functionality in Python, following the guide at https://k2-fsa.github.io/sherpa/onnx/websocket/online-websocket.html#start-the-client-python-with-microphone.

It appeared to have started successfully, but when I spoke, there was no output of any kind.

Started! Please speak
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/features.cc:AcceptWaveformImpl:89 Creating a resampler:
   in_sample_rate: 48000
   output_sample_rate: 16000

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

Please post the complete command you used to start the server.

csukuangfj avatar Dec 15 '23 03:12 csukuangfj

Also, please test it with https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/speech-recognition-from-microphone-with-endpoint-detection.py which does not use a server or a client. That makes debugging easier.

csukuangfj avatar Dec 15 '23 03:12 csukuangfj

> Please post the complete command you used to start the server.

I see. This is the command I used to start the server.

(base) MacBook-Pro sherpa-onnx % python python-api-examples/speech-recognition-from-microphone.py \
--tokens=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt \
--encoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/encoder-epoch-20-avg-1-chunk-16-left-128.onnx \
--decoder=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/decoder-epoch-20-avg-1-chunk-16-left-128.onnx \
--joiner=./sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/joiner-epoch-20-avg-1-chunk-16-left-128.onnx
  0 DELL U2422HX, Core Audio (0 in, 2 out)
  1 iPhone Microphone, Core Audio (1 in, 0 out)
> 2 MacBook Pro Microphone, Core Audio (1 in, 0 out)
< 3 MacBook Pro Speakers, Core Audio (0 in, 2 out)
  4 Microsoft Teams Audio, Core Audio (2 in, 2 out)
Use default device: MacBook Pro Microphone
Started! Please speak
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/features.cc:AcceptWaveformImpl:89 Creating a resampler:
   in_sample_rate: 48000
   output_sample_rate: 16000

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

> Also, please test it with https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/speech-recognition-from-microphone-with-endpoint-detection.py which does not use a server or a client. That makes debugging easier.

Okay. I will give it a try.

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

The result seems to be the same as with speech-recognition-from-microphone.py. I think the problem might be the microphone or the streaming input. I will check it.

OswaldoBornemann avatar Dec 15 '23 03:12 OswaldoBornemann

> This is the command I used to start the server.

The speech-recognition-from-microphone.py command you posted works perfectly on my side. Please check your microphone.

By the way, you can use the model to decode files. If that works, then the problem must be with your microphone.
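As a rough illustration of one way to check the microphone (a sketch that is not from the thread): record a few seconds with any recording tool, save the result as a 16-bit PCM WAV file, and look at the signal level. The helper names and the `0.01` peak threshold below are assumptions chosen for illustration, not values taken from sherpa-onnx.

```python
import array
import math
import sys
import wave


def read_wav_mono_int16(path):
    """Return (sample_rate, samples) with samples scaled to [-1, 1]."""
    with wave.open(path, "rb") as f:
        if f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = f.getnchannels()
        rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    mono = pcm[::channels]  # keep the first channel only
    return rate, [s / 32768.0 for s in mono]


def audio_level(samples):
    """Return (peak, rms) of the samples; (0.0, 0.0) for empty input."""
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def looks_silent(samples, peak_threshold=0.01):
    """Heuristic: treat the recording as silent if the peak is tiny.

    The threshold is an assumed value; tune it for your setup.
    """
    peak, _ = audio_level(samples)
    return peak < peak_threshold
```

If `looks_silent` returns `True` for a recording made while speaking, the input device is probably muted, not the one you expect, or blocked by operating-system microphone permissions, and the recognizer would be receiving only silence.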

csukuangfj avatar Dec 15 '23 04:12 csukuangfj

Yeah, I think so. I will check the microphone to see what happened.

OswaldoBornemann avatar Dec 15 '23 05:12 OswaldoBornemann