VoiceStreamAI
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
Hello @alesaccoia, thanks for the work on VoiceStreamAI. I was trying to make this work, but after setting the config and successfully establishing the WebSocket connection, ...
While running the program, irrelevant output like "okay" and "thank you" appears during silence periods. Is there a way to fix this, or is it inherent to faster-whisper?
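Whisper-family models are known to hallucinate filler phrases like "thank you" on silent audio. Two common mitigations (general faster-whisper practice, not something this repo necessarily does) are passing `vad_filter=True` to `WhisperModel.transcribe`, or dropping segments whose `no_speech_prob` is high. A minimal sketch of the latter, with a hypothetical `Segment` stand-in for faster-whisper's segment objects:

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Stand-in for a faster-whisper segment; the real object also
    # carries timestamps, avg_logprob, etc.
    text: str
    no_speech_prob: float


def drop_probable_silence(segments, threshold=0.6):
    """Keep only segments the model considers likely real speech.

    Segments with a high no_speech_prob are usually hallucinated
    filler produced during silence.
    """
    return [s for s in segments if s.no_speech_prob < threshold]


segments = [
    Segment("Thank you.", 0.92),    # typical silence hallucination
    Segment("Hello, world.", 0.05),  # real speech
]
kept = drop_probable_silence(segments)
print([s.text for s in kept])  # ['Hello, world.']
```

The `0.6` threshold is a starting point to tune, not a documented default.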
I am using this code, but I have a question. In src/buffering_strategy/buffering_strategies.py, line 87, I can't understand why the following code is needed: `self.client.buffer.clear()`. Please help me. Thank you.
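Based on the linked code, the buffering strategy appears to move the incoming `buffer` into `scratch_buffer` before handing it to the transcriber; clearing `buffer` afterwards prevents the same bytes from being appended and transcribed again on the next trigger. A simplified sketch of that pattern (hypothetical names and threshold, not the repo's exact code):

```python
class Client:
    def __init__(self):
        self.buffer = bytearray()          # audio received since the last trigger
        self.scratch_buffer = bytearray()  # audio handed to the transcriber


def on_chunk(client, chunk, max_len=16):
    client.buffer += chunk
    if len(client.buffer) >= max_len:
        # Copy the accumulated audio into the processing buffer...
        client.scratch_buffer += client.buffer
        # ...then clear it, so the same bytes are not copied (and
        # re-transcribed) again when the next chunk arrives.
        client.buffer.clear()


c = Client()
on_chunk(c, b"\x00" * 16)
print(len(c.scratch_buffer), len(c.buffer))  # 16 0
```

Without the `clear()`, every trigger would re-append all previously seen audio, so the transcriber would process duplicate speech and the buffer would grow without bound.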
Wondering if the documentation of samples_width is correct here: https://github.com/alesaccoia/VoiceStreamAI/blob/465403b7039d1f54ba6b8d69c69c40b55bf300c1/src/server.py#L25 When the buffer and scratch_buffer durations are calculated, there is no division by 8: https://github.com/alesaccoia/VoiceStreamAI/blob/465403b7039d1f54ba6b8d69c69c40b55bf300c1/src/buffering_strategy/buffering_strategies.py#L116-L118 https://github.com/alesaccoia/VoiceStreamAI/blob/465403b7039d1f54ba6b8d69c69c40b55bf300c1/src/buffering_strategy/buffering_strategies.py#L73-L77
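The discrepancy is most likely bytes versus bits: if `samples_width` is measured in bytes per sample (2 for 16-bit PCM), the duration calculation needs no division by 8; dividing by 8 would only be needed if the width were given in bits. A sketch of the arithmetic, assuming 16 kHz mono 16-bit PCM (the values are assumptions, not read from the repo's config):

```python
SAMPLE_RATE = 16000   # samples per second
SAMPLES_WIDTH = 2     # bytes per sample (16-bit PCM)


def buffer_duration_seconds(buffer: bytes) -> float:
    # len(buffer) is in bytes, so:
    #   bytes / (samples/s * bytes/sample) = seconds
    return len(buffer) / (SAMPLE_RATE * SAMPLES_WIDTH)


one_second = bytes(SAMPLE_RATE * SAMPLES_WIDTH)  # 32000 bytes
print(buffer_duration_seconds(one_second))  # 1.0
```

So the code is consistent if `samples_width` means bytes; if the docstring says bits, the docstring is the part that needs fixing.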
https://github.com/alesaccoia/VoiceStreamAI/assets/39730824/c86907e4-f6df-4bce-bf75-090ee34f7384
How can I use audio chunks from a video element and transcribe them? Thanks.
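One way (a sketch, not something this repo ships) is to tap the `<video>` element's audio with the Web Audio API and convert the Float32 samples to 16-bit PCM before sending them over the WebSocket. The browser wiring in `streamVideoAudio` is hypothetical and only runs in a browser; the conversion helper is the portable part:

```javascript
// Convert Web Audio Float32 samples (-1..1) to 16-bit signed PCM,
// the format a PCM-expecting server would consume.
function floatTo16BitPCM(float32) {
  const out = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Browser-only wiring (hypothetical sketch, not executed here):
// route the <video> element's audio through a processing node and
// stream PCM chunks over an open WebSocket.
function streamVideoAudio(videoEl, ws) {
  const ctx = new AudioContext({ sampleRate: 16000 });
  const source = ctx.createMediaElementSource(videoEl);
  const processor = ctx.createScriptProcessor(4096, 1, 1);
  source.connect(processor);
  processor.connect(ctx.destination); // keep the video audible
  processor.onaudioprocess = (e) => {
    const pcm = floatTo16BitPCM(e.inputBuffer.getChannelData(0));
    if (ws.readyState === WebSocket.OPEN) ws.send(pcm.buffer);
  };
}
```

Note that `createScriptProcessor` is deprecated in favor of `AudioWorklet`, which does the same tap off the main thread; the PCM conversion is identical either way.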
Current UI: https://github.com/alesaccoia/VoiceStreamAI/assets/39730824/98cef236-7ed5-474c-b184-b082b8526df9
New UI: https://github.com/alesaccoia/VoiceStreamAI/assets/39730824/c997744b-63c9-4bc2-ad62-a537a06a8d96