WhisperLiveKit bug: audio detection and file upload limitations

WhisperLiveKit is experiencing two key limitations:

Computer Audio Detection Failure
- When attempting to record audio playing from the computer, the application fails to detect system audio
- Voice recognition works correctly when speaking directly into the microphone

File Upload Restriction
- Unable to upload voice recordings to localhost

Steps to Reproduce

1. Play audio from computer
2. Attempt to record using WhisperLiveKit
3. Observe lack of audio detection from system sound

Expected Behavior

Detect and transcribe audio from computer sources
Allow uploading of voice recordings to localhost

Actual Behavior

System audio not recognized during recording
File upload functionality unavailable

Additional Context

Microphone direct input works correctly
Requires investigation into system audio capture mechanisms
Needs implementation of file upload feature

Potential Solutions

Investigate system audio capture libraries
Implement file upload endpoint for localhost

Aug 31 '25 15:08 venturero

Hello Venturero, to record system audio you need some kind of loopback device that presents the output audio as an input device. This is a general OS-level issue, not one specific to WLK.

Some audio interfaces provide this natively: most professional audio interfaces do, and some consumer devices do as well, under names like "Stereo Mix" or "What U Hear", on Windows at least.
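To check whether a machine already exposes such a loopback input, something like the following rough sketch could help. It filters a list of device records by name. The record shape (`name`, `max_input_channels`) mirrors what the `sounddevice` package returns from `query_devices()`, but the function itself, the hint strings, and the sample data are my own assumptions for illustration, not anything WLK ships.

```python
# Hypothetical helper: name fragments are guesses based on the labels
# mentioned in this thread; real device names vary by driver and OS.
LOOPBACK_HINTS = ("stereo mix", "what u hear", "cable output")


def find_loopback_inputs(devices):
    """Return names of input-capable devices that look like loopback sources."""
    matches = []
    for dev in devices:
        name = dev.get("name", "")
        has_inputs = dev.get("max_input_channels", 0) > 0
        if has_inputs and any(hint in name.lower() for hint in LOOPBACK_HINTS):
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Sample data only; on a real system you could pass
    # list(sounddevice.query_devices()) instead (requires `pip install sounddevice`).
    sample = [
        {"name": "Microphone (USB Audio)", "max_input_channels": 2},
        {"name": "Stereo Mix (Realtek Audio)", "max_input_channels": 2},
        {"name": "Speakers (Realtek Audio)", "max_input_channels": 0},
    ]
    print(find_loopback_inputs(sample))
```

If the list comes back empty, the OS is probably not exposing a loopback input and a virtual device is needed.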

If you are running Windows or Mac and lack such an option, look into the excellent VB-Audio Cable. It creates a virtual sound device for this task, and I'm using it successfully with WLK. That said, it would be nice to see a direct stream sink in WLK other than websockets. Since FFmpeg is already used internally to convert incoming audio, it should be a relatively small addition to start a custom listening instance on a different port with a command switch. If I had the spare time I'd look into it myself, but I don't have that luxury at the moment.
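To make the listening-instance idea a bit more concrete, here is a minimal sketch of how such a command line could be assembled. It is not WLK code: the function name, the default port handling, and the choice of 16 kHz mono signed 16-bit PCM as the output format are assumptions on my part (Whisper models expect 16 kHz mono, but what WLK's internal pipeline actually consumes would need checking). The `tcp://...?listen=1` input URL is FFmpeg's standard way of waiting for a single incoming TCP connection.

```python
import shlex


def build_ffmpeg_listen_cmd(port, sample_rate=16000, host="0.0.0.0"):
    """Build an FFmpeg argv that accepts one TCP stream and emits raw PCM on stdout."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return [
        "ffmpeg",
        "-hide_banner",
        "-loglevel", "error",
        # listen=1 makes FFmpeg act as the server and wait for a single client
        "-i", f"tcp://{host}:{port}?listen=1",
        # Assumed target format: raw signed 16-bit little-endian, mono
        "-f", "s16le",
        "-acodec", "pcm_s16le",
        "-ac", "1",
        "-ar", str(sample_rate),
        "pipe:1",
    ]


if __name__ == "__main__":
    # Print the command instead of running it; a server could hand this list
    # to subprocess.Popen(cmd, stdout=subprocess.PIPE) and read PCM chunks.
    print(shlex.join(build_ffmpeg_listen_cmd(9000)))
```

A sender could then push any container FFmpeg understands to that port, for example from another FFmpeg instance capturing the loopback device.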

As for your second limitation regarding file upload, the name of the project is quite telling: Whisper LIVE Kit. I would say it makes little sense at this stage to add offline functionality to this project when there are tons of Whisper implementations that do what you ask for, not to mention the original project from OpenAI. IMO the maintainer of and contributors to this project should definitely focus on improving the live aspect, which is its unique strength.

BR Alexander

Sep 08 '25 12:09 Alexander-ARTV

There is a way to have the browser share the computer's content, including its audio.

Oct 10 '25 12:10 XjiangSail