Feature Request: Add voice chat mode with interruption support

Open iye opened this issue 4 months ago • 5 comments

I'd like to request support for a voice chat mode in SurfSense where the user can speak a prompt and receive an audio response back, enabling full speech-to-speech interaction.

My suggestions for this feature:

- Add a real-time voice chat mode (speech-to-speech).
- Include support for user-defined stop words (such as "stop" or "cancel") that can interrupt TTS playback while it's speaking, to allow more natural and hands-free interaction.
- Ideally, let users toggle this mode on or off through the UI or config.

One possible implementation path would be to leverage WebRTC for capturing and streaming audio. WebRTC includes built-in Voice Activity Detection (VAD), which can be used to automatically detect when the user starts or stops speaking, enabling natural interruption of TTS and seamless hands-free interaction. In addition to VAD, WebRTC also provides support for low-latency audio transmission, echo cancellation, noise suppression, automatic gain control, and cross-platform compatibility across browsers and mobile environments. These features make it a strong candidate for implementing a responsive and privacy-preserving voice chat mode, especially when combined with local LLM and TTS/STT components.
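To make the VAD idea concrete, here is a minimal sketch of frame-level speech detection in Python. It is a simple energy threshold with a short hangover, not WebRTC's own VAD algorithm, and the names `frame_rms`, `detect_speech`, `threshold`, and `hangover` are hypothetical choices for illustration only.

```python
import math
from typing import Iterable, List, Sequence


def frame_rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of 16-bit PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(
    frames: Iterable[Sequence[int]],
    threshold: float = 500.0,  # assumed value; would need tuning per microphone
    hangover: int = 2,         # quiet frames still counted as speech after a loud one
) -> List[bool]:
    """Return one flag per frame: True while the user is considered to be speaking."""
    flags: List[bool] = []
    quiet = hangover  # start in the "silent" state
    for frame in frames:
        if frame_rms(frame) >= threshold:
            quiet = 0
            flags.append(True)
        elif quiet < hangover:
            # Short pauses inside an utterance should not end it immediately.
            quiet += 1
            flags.append(True)
        else:
            flags.append(False)
    return flags
```

A transition from `False` to `True` would be the point where TTS playback could be paused. A threshold like this reacts to any loud sound, not only to speech, so it would need another signal alongside it in noisy rooms.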

Although WebRTC's VAD detects user speech, background noise in noisy environments can trigger it falsely and stop TTS playback, so the option of stop words is still useful.
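As a rough sketch of how stop words could gate the interruption, the Python below only interrupts TTS when speech was detected and the STT transcript contains a configured stop word. The function names, the `DEFAULT_STOP_WORDS` set, and the assumption that an STT component supplies a text transcript are all hypothetical; nothing here is an existing SurfSense API.

```python
import re
from typing import Iterable

# Assumed defaults, taken from the examples in this request.
DEFAULT_STOP_WORDS = ("stop", "cancel")


def contains_stop_word(transcript: str, stop_words: Iterable[str] = DEFAULT_STOP_WORDS) -> bool:
    """True if any whole word in the transcript matches a stop word (case-insensitive)."""
    wanted = {w.strip().lower() for w in stop_words if w.strip()}
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return any(token in wanted for token in tokens)


def should_interrupt_tts(
    speech_detected: bool,
    transcript: str,
    stop_words: Iterable[str] = DEFAULT_STOP_WORDS,
) -> bool:
    """Interrupt only when VAD fired AND the recognized text contains a stop word."""
    return speech_detected and contains_stop_word(transcript, stop_words)
```

With this gating, background noise that trips the VAD but produces no matching transcript leaves playback running. The sketch matches single words only; multi-word stop phrases would need phrase matching instead of token lookup.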

I believe LiveKit uses WebRTC.

Looking forward to seeing this added.

iye avatar Aug 21 '25 17:08 iye

Can I work on this issue?

iamsyg avatar Oct 19 '25 05:10 iamsyg

@iamsyg Sure. Thanks for your interest 👍

MODSetter avatar Oct 19 '25 20:10 MODSetter

Please could you label this issue as Hacktoberfest?

iamsyg avatar Oct 20 '25 12:10 iamsyg

@iamsyg Done 👍

MODSetter avatar Oct 20 '25 18:10 MODSetter

@MODSetter Thank you

iamsyg avatar Oct 21 '25 06:10 iamsyg