vosk-android-demo

Sampling rate

Open sidra718 opened this issue 3 years ago • 4 comments

Hello,

I am using Android's AudioManager to capture sound from a Bluetooth headset instead of the internal microphone of my tablet. However, the following restrictions apply to input streams:

  • The format must be mono
  • The sampling rate must be 8kHz

When I change the sampling rate from 16kHz to 8kHz when defining the model and speechService, the app crashes. Do you have any solution that might help?

sidra718, Jan 20 '21 09:01

This is probably because the model expects (and was trained on) 16kHz speech.

You could try upsampling the input from 8kHz to 16kHz in the app (I'd have no idea how), find a model trained on 8kHz (again, I don't know where from), or train one yourself.

OscarVanL, Jan 21 '21 20:01
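
One way to try the upsampling suggestion above is plain linear interpolation on the 16-bit mono PCM buffer before it reaches the recognizer. The sketch below is only an illustration: the class and method names are hypothetical, it assumes an exact 2x ratio (8kHz to 16kHz), and it does not restore any frequency content above 4kHz, so a model trained on 16kHz speech may still recognize less accurately than with a real 16kHz input.

```java
import java.util.Arrays;

public class Upsample8kTo16k {
    /**
     * Doubles the sample rate of 16-bit mono PCM by linear interpolation:
     * every input sample is kept, and the midpoint between it and the next
     * sample is inserted after it.
     */
    public static short[] upsample2x(short[] in) {
        short[] out = new short[in.length * 2];
        for (int i = 0; i < in.length; i++) {
            int cur = in[i];
            // At the end of the buffer there is no next sample, so repeat the last one.
            int next = (i + 1 < in.length) ? in[i + 1] : cur;
            out[2 * i] = (short) cur;
            out[2 * i + 1] = (short) ((cur + next) / 2);
        }
        return out;
    }

    public static void main(String[] args) {
        short[] captured = {0, 100, -100, 50};
        // prints [0, 50, 100, 0, -100, -25, 50, 50]
        System.out.println(Arrays.toString(upsample2x(captured)));
    }
}
```

The upsampled buffer would then be passed to a recognizer that was created for 16000 Hz, while the audio capture itself stays at 8000 Hz.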

Yes, I figured that out. I am also not sure how to upsample the input or find a model trained on 8kHz. I don't have a problem with the model, however. I just want to send my voice message through a Bluetooth headset instead of the internal microphone of my tablet.

sidra718, Jan 26 '21 14:01

Sure, but unless you can receive input from your Bluetooth headset at 16kHz you're going to need to do either of those things. And I'm not sure either, I've not had to do it.

OscarVanL, Jan 26 '21 14:01

You can add --allow-upsample=true in conf/mfcc.conf inside the model directory. This way 8kHz should also be accepted by the recognizer.
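
As a rough illustration, the change amounts to one extra line in conf/mfcc.conf. The fragment below shows only the added option; the options already present in your model's file are not reproduced here and should be left as they are.

```
# existing options in conf/mfcc.conf stay unchanged
--allow-upsample=true
```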

To do that, it might be possible to simply unzip the aar archive and adjust conf/mfcc.conf accordingly. You might also need to recompute the md5 checksum (md5 assets/model-en-us/conf/mfcc.conf) and update the value inside sync/model-android/conf/mfcc.conf.md5.

mmende, Aug 10 '22 07:08
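
If the md5 command-line tool is not available, the checksum of the edited file can also be computed with the Java standard library. This is a minimal sketch with hypothetical class and method names; it prints only the hex digest, and the exact layout expected inside mfcc.conf.md5 is an assumption that should be checked against the existing file before overwriting it.

```java
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Of {
    /** Returns the MD5 digest of the given bytes as 32 lowercase hex characters. */
    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            // BigInteger with signum 1 treats the digest as unsigned; %032x keeps leading zeros.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of the edited file as the first argument,
        // e.g. assets/model-en-us/conf/mfcc.conf
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(md5Hex(content));
    }
}
```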