vosk-api
test_gradio.py with vosk not working
Hello, I want to serve Vosk through Gradio using the Turkish model, but I could not get it to work.
The code runs, but nothing is written to the Gradio screen.
When I examine the code, I can see that I do receive the audio as bytes, but rec.Result() returns an empty result:
('\n ', (<vosk.KaldiRecognizer object at 0x000002543D113760>, []))
I am sharing the sample code below. Can you help, please?
import json

import gradio as gr
from vosk import KaldiRecognizer, Model

model = Model(r"C:\Users\Administrator\PycharmProjects\vosk\model\vosk-model-small-tr-0.3\vosk-model-small-tr-0.3")


def transcribe(data, state):
    sample_rate, audio_data = data
    audio_data = (audio_data >> 16).astype("int16").tobytes()

    if state is None:
        rec = KaldiRecognizer(model, sample_rate)
        result = []
    else:
        rec, result = state

    if rec.AcceptWaveform(audio_data):
        text_result = json.loads(rec.Result())["text"]
        if text_result != "":
            result.append(text_result)
        partial_result = ""
    else:
        partial_result = json.loads(rec.PartialResult())["partial"] + " "

    return "\n".join(result) + "\n" + partial_result, (rec, result)


gr.Interface(
    fn=transcribe,
    inputs=[
        gr.Audio(source="microphone", type="numpy", streaming=True),
        "state"
    ],
    outputs=[
        "textbox",
        "state"
    ],
    live=True).launch(share=True)
Please click on "Edit" and format your post properly
I edited but I don't understand exactly what you mean @nshmyrev
Still needs edits. You can check here: https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code
I've edited it. Could you please help me solve the problem? @nshmyrev
The Chrome browser doesn't support audio recording on localhost. Are you running it on your local machine or on a remote server? If remote, do you access it over https? Did you enable audio recording on localhost as described here:
https://stackoverflow.com/questions/16835421/how-to-allow-chrome-to-access-my-camera-on-localhost
Yes, I am running it locally. I changed the Chrome setting as shown below, but that did not resolve it. I even tried another browser and it still didn't work. When I step through the code, it looks like the binary audio is not being converted to text: the text in the returned data ends up empty, although the recognizer object next to it is populated.
Also, the other code I have shared does not run in the browser, but it works in the console. The conversion logic is a little different there. I want it to work on Gradio, as a web server. @nshmyrev
It is still a permission issue, so you get no audio. You need to check the JavaScript console log.
Hi, I tried to use Vosk with Gradio, and it's not working. Speech is not being recognized.
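One way to tell a permission problem (no audio at all) apart from a conversion problem is to look at the raw chunk before it is converted. The sketch below is only an illustration: `describe_chunk` is a hypothetical helper, not part of vosk or gradio, and it assumes the chunk arrives as a NumPy integer array, as it does with `type="numpy"` in the code above.

```python
import numpy as np


def describe_chunk(audio_data: np.ndarray) -> dict:
    """Hypothetical debugging helper: summarize one chunk of samples."""
    if audio_data.size == 0:
        peak = 0
    else:
        # Widen to int64 first so abs(-32768) does not overflow in int16.
        peak = int(np.abs(audio_data.astype(np.int64)).max())
    return {"dtype": str(audio_data.dtype), "shape": audio_data.shape, "peak": peak}


# A silent chunk (what a blocked microphone might look like) has peak 0.
print(describe_chunk(np.zeros(4, dtype=np.int16)))
# A chunk with real signal has a non-zero peak.
print(describe_chunk(np.array([120, -3400, 560], dtype=np.int16)))
```

Calling it at the top of `transcribe` and printing the result would show whether the samples are all zero and which dtype they have.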
Chrome setting
Web console.
I changed "audio_data = (audio_data >> 16).astype("int16").tobytes()" to "audio_data = audio_data.astype("int16").tobytes()", and then it works.
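A likely explanation for why this fix works, assuming Gradio delivers the microphone samples as 16-bit integers in this setup (the thread does not confirm the dtype): shifting an int16 value right by 16 bits leaves only the sign, so every sample becomes 0 or -1 and the recognizer hears silence. The sketch below shows the effect; `to_pcm16_bytes` is a hypothetical helper, not part of vosk or gradio, that shifts only when the input is 32-bit.

```python
import numpy as np


def to_pcm16_bytes(audio_data: np.ndarray) -> bytes:
    """Hypothetical helper: convert integer samples to 16-bit PCM bytes."""
    if audio_data.dtype == np.int32:
        # 32-bit samples: keep the top 16 bits.
        audio_data = audio_data >> 16
    # 16-bit samples pass through unchanged.
    return audio_data.astype("int16").tobytes()


samples = np.array([1000, -2000, 12345, -32768], dtype=np.int16)

# The original shift wipes out 16-bit samples, leaving only the sign.
print((samples >> 16).tolist())  # [0, -1, 0, -1]

# Without the shift, the samples survive the round trip to bytes.
print(np.frombuffer(to_pcm16_bytes(samples), dtype=np.int16).tolist())
```

In `transcribe`, the conversion line would then read `audio_data = to_pcm16_bytes(audio_data)`.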