
Converting audio with java

Open NicklasMatzulla opened this issue 2 years ago • 8 comments

Good day, I've been trying all day to convert audio so that Vosk can work with it. The default audio format I get (from Discord) is the following:

AudioFormat OUTPUT_FORMAT = new AudioFormat(48000.0F, 16, 2, true, true);

I want to convert this to a format Vosk accepts, but unfortunately I don't know exactly how. Can anyone help me? A code snippet would be best; I have to admit that audio processing is not my thing :/

Here is also my code: https://just-paste.it/FB0TyErtQP

Kind regards Nicklas

NicklasMatzulla avatar Jun 26 '22 23:06 NicklasMatzulla

It has to be 1 instead of 2 (mono, not stereo).
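One way to get from 2 channels to 1 is to average the left and right sample of every stereo frame. The following is only a minimal sketch, assuming interleaved 16-bit signed PCM; `StereoDownmix` and `toMono` are hypothetical names chosen for illustration and are not part of Vosk or Discord.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class StereoDownmix {
    /**
     * Averages the left and right 16-bit samples of interleaved stereo PCM
     * into a single mono channel. The byte order of the output matches the input.
     */
    static byte[] toMono(byte[] stereo, ByteOrder order) {
        if (stereo.length % 4 != 0) {
            throw new IllegalArgumentException("expected whole 16-bit stereo frames (4 bytes each)");
        }
        ByteBuffer in = ByteBuffer.wrap(stereo).order(order);
        ByteBuffer out = ByteBuffer.allocate(stereo.length / 2).order(order);
        while (in.remaining() >= 4) {
            int left = in.getShort();
            int right = in.getShort();
            // The sum of two shorts fits in an int, so the average cannot overflow.
            out.putShort((short) ((left + right) / 2));
        }
        return out.array();
    }

    public static void main(String[] args) {
        // One big-endian stereo frame: left = 100 (0x0064), right = 300 (0x012C).
        byte[] frame = {0x00, 0x64, 0x01, 0x2C};
        byte[] mono = toMono(frame, ByteOrder.BIG_ENDIAN);
        System.out.println(ByteBuffer.wrap(mono).getShort()); // prints 200
    }
}
```

The matching `AudioFormat` for the output would then use 1 as the channel count, with the other parameters unchanged.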

nshmyrev avatar Jun 26 '22 23:06 nshmyrev

It has to be 1 instead of 2 (mono, not stereo).

Hey, this is the audio format of Discord. Do you know how I can convert it?

ghost avatar Jun 26 '22 23:06 ghost

Something like

https://stackoverflow.com/a/50657152

nshmyrev avatar Jun 26 '22 23:06 nshmyrev

I modified my source code, but it still does not work: https://just-paste.it/0O65WQXkAB

ghost avatar Jun 26 '22 23:06 ghost

AudioFormat OUTPUT_FORMAT = new AudioFormat(48000.0F, 16, 2, true, true); should probably end in true, false, since the data is usually little-endian.
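For reference, the last two arguments of this `AudioFormat` constructor are `signed` and `bigEndian`. If the incoming bytes really are big-endian and the consumer expects little-endian, the two bytes of each sample have to be swapped. A minimal sketch, assuming 16-bit samples; `PcmByteOrder` is a hypothetical helper name used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmByteOrder {
    /** Rewrites 16-bit big-endian PCM samples as 16-bit little-endian samples. */
    static byte[] bigToLittleEndian16(byte[] bigEndian) {
        if (bigEndian.length % 2 != 0) {
            throw new IllegalArgumentException("expected whole 16-bit samples (2 bytes each)");
        }
        ByteBuffer in = ByteBuffer.wrap(bigEndian).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer out = ByteBuffer.allocate(bigEndian.length).order(ByteOrder.LITTLE_ENDIAN);
        while (in.remaining() >= 2) {
            // Read the value in one byte order, write it back in the other.
            out.putShort(in.getShort());
        }
        return out.array();
    }

    public static void main(String[] args) {
        byte[] be = {0x01, 0x2C};              // 300 as big-endian
        byte[] le = bigToLittleEndian16(be);   // 300 as little-endian
        System.out.printf("%02X %02X%n", le[0], le[1]); // prints 2C 01
    }
}
```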

nshmyrev avatar Jun 26 '22 23:06 nshmyrev

AudioFormat OUTPUT_FORMAT = new AudioFormat(48000.0F, 16, 2, true, true); is the audio format from discord, this is not my audio format. I get the data in this format.

ghost avatar Jun 27 '22 00:06 ghost

AudioFormat OUTPUT_FORMAT = new AudioFormat(48000.0F, 16, 2, true, true); should probably end in true, false, since the data is usually little-endian.

Hello, I guess I have to apologize, I did not read correctly. I have now adjusted the value, and the speech recognition works without problems. Nevertheless, a new problem occurs: I never get output from getResult, only from getPartialResult. Normally recognition should end after a sentence, but no matter how long I wait, nothing comes from getResult and everything is strung together into one long partial result. Maybe you can help me further.

Here is also my current source code: https://just-paste.it/R0VJTcJlcO
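For context, the usual feeding pattern is to call `getResult` only when `acceptWaveForm` returns true (the recognizer has detected the end of an utterance), to call `getPartialResult` otherwise, and to call `getFinalResult` once the stream ends. The sketch below shows only that control flow. `WaveformRecognizer` is a hypothetical stand-in interface declared here so the example runs without a model; it mirrors the method names of `org.vosk.Recognizer`, and the fake implementation in `main` is an assumption made purely for demonstration.

```java
import java.util.ArrayList;
import java.util.List;

final class RecognizerLoop {
    /** Hypothetical stand-in mirroring the org.vosk.Recognizer methods used below. */
    interface WaveformRecognizer {
        boolean acceptWaveForm(byte[] data, int len);
        String getResult();
        String getPartialResult();
        String getFinalResult();
    }

    /** Feeds all chunks and collects every finished utterance plus the final flush. */
    static List<String> feed(WaveformRecognizer rec, List<byte[]> chunks) {
        List<String> finished = new ArrayList<>();
        for (byte[] chunk : chunks) {
            if (rec.acceptWaveForm(chunk, chunk.length)) {
                // End of utterance detected: a full result is available now.
                finished.add(rec.getResult());
            } else {
                // Still inside an utterance: only a partial hypothesis exists.
                rec.getPartialResult();
            }
        }
        // Flush whatever is still buffered once no more audio will arrive.
        finished.add(rec.getFinalResult());
        return finished;
    }

    public static void main(String[] args) {
        // Fake recognizer (assumption): treats an empty chunk as an utterance boundary.
        WaveformRecognizer fake = new WaveformRecognizer() {
            private int fed = 0;
            public boolean acceptWaveForm(byte[] data, int len) { fed++; return len == 0; }
            public String getResult() { return "result after " + fed + " chunks"; }
            public String getPartialResult() { return "partial after " + fed + " chunks"; }
            public String getFinalResult() { return "final after " + fed + " chunks"; }
        };
        List<byte[]> chunks = List.of(new byte[4], new byte[0], new byte[4]);
        System.out.println(feed(fake, chunks));
    }
}
```

If `acceptWaveForm` never returns true in this pattern, only partial results will ever appear, which matches the symptom described above; the sketch does not say why that happens in a particular setup.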

ghost avatar Jun 27 '22 00:06 ghost

You need to dump the bytes you feed to the recognizer and share the result.
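A simple way to do that is to append every chunk to a raw file immediately before passing it to the recognizer. This is only a sketch; `PcmDump` and the file name are hypothetical and chosen for illustration. The resulting file is headerless PCM, so whoever opens it needs to be told the sample rate, sample size, channel count, and byte order.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class PcmDump implements AutoCloseable {
    private final OutputStream out;

    PcmDump(String path) throws IOException {
        this.out = new FileOutputStream(path);
    }

    /** Writes exactly the bytes that are about to be handed to the recognizer. */
    void write(byte[] data, int len) throws IOException {
        out.write(data, 0, len);
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("recognizer-input", ".raw");
        try (PcmDump dump = new PcmDump(file.toString())) {
            byte[] chunk = {1, 2, 3, 4};
            dump.write(chunk, chunk.length); // call this right before feeding the chunk
        }
        System.out.println(Files.size(file)); // prints 4
    }
}
```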

nshmyrev avatar Jun 27 '22 07:06 nshmyrev