feat: implement bidirectional microphone pass-through
Description
This PR implements complete bidirectional microphone support for the Sunshine streaming server, enabling Moonlight clients to send their microphone audio back to the server for output through the host's speakers/headphones.
This addresses a microphone pass-through feature request that the community has had open for over 5 years, closing a long-standing gap in the streaming ecosystem.
Key Implementation Details:
- Added new packet types (`IDX_MIC_DATA`, `IDX_MIC_CONFIG`) for microphone data transmission
- Implemented a dedicated microphone stream on port 12 (`MIC_STREAM_PORT`) for client-to-server audio
- Created cross-platform audio output infrastructure with platform-specific implementations
- Integrated RTSP protocol extensions for automatic microphone capability advertisement
- Added comprehensive configuration options (`enable_mic_passthrough`, `mic_sink`)
Screenshot
N/A - This is a server-side protocol and audio infrastructure implementation without UI changes.
Issues Fixed or Closed
- Resolves https://github.com/LizardByte/roadmap/issues/56
Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Dependency update (updates to dependencies)
- [ ] Documentation update (changes to documentation)
- [ ] Repository update (changes to repository files, e.g. `.github/...`)
Checklist
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added or updated the in code docstring/documentation-blocks for new or existing methods/components
Technical Implementation Details
Protocol Extensions
- Extended `stream.h` with `MIC_STREAM_PORT = 12`
- Added `socket_e::microphone` for dedicated mic socket handling
- New packet types in protocol for microphone data and configuration
Audio Processing Pipeline
- `audio::mic_receive()` function for processing incoming microphone packets
- Opus decoder integration for real-time audio processing
- `mic_output_t` interface for platform-specific audio output
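As a rough illustration of how the pieces in this list might fit together, the sketch below shows one possible shape for the `mic_output_t` interface, with an in-memory backend standing in for the platform implementations. The method name `write`, the class `mic_output_buffer_t`, and the helper `mic_receive_sketch` are assumptions made for illustration, not the signatures used in this PR. The Opus decode step is replaced by a raw PCM copy so the example has no external dependencies.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interface shape; the real mic_output_t in the PR may differ.
class mic_output_t {
public:
  virtual ~mic_output_t() = default;
  // Write interleaved 16-bit PCM to the host device; returns frames accepted.
  virtual std::size_t write(const std::int16_t *pcm, std::size_t frames, int channels) = 0;
};

// Stand-in backend that only records samples (in place of PulseAudio/WASAPI/AVFoundation).
class mic_output_buffer_t: public mic_output_t {
public:
  std::size_t write(const std::int16_t *pcm, std::size_t frames, int channels) override {
    samples.insert(samples.end(), pcm, pcm + frames * static_cast<std::size_t>(channels));
    return frames;
  }

  std::vector<std::int16_t> samples;
};

// Placeholder for the receive path. A real pipeline would run the Opus decoder here;
// this sketch assumes the payload is already little-endian 16-bit PCM.
std::size_t mic_receive_sketch(mic_output_t &out, const std::vector<std::uint8_t> &payload, int channels) {
  if (channels <= 0) {
    return 0;
  }
  std::vector<std::int16_t> pcm(payload.size() / 2);
  for (std::size_t i = 0; i < pcm.size(); ++i) {
    const std::uint16_t u = static_cast<std::uint16_t>(payload[2 * i] | (payload[2 * i + 1] << 8));
    pcm[i] = static_cast<std::int16_t>(u);
  }
  const std::size_t frames = pcm.size() / static_cast<std::size_t>(channels);
  return out.write(pcm.data(), frames, channels);
}
```

Keeping the backend behind a small virtual interface is what would let the receive path stay identical across Linux, Windows, and macOS.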
Platform Support
- Linux: `mic_output_pa_t` using PulseAudio for audio output
- Windows: `mic_output_wasapi_t` using WASAPI for low-latency audio
- macOS: `av_mic_output_t` using AVFoundation framework
Network Infrastructure
- `micReceiveThread()` for UDP packet reception on port 12
- Proper socket binding and management in broadcast context
- Integration with existing session and thread management
Configuration
Add to Sunshine configuration:
enable_mic_passthrough=true
mic_sink=default # or specify audio device name
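For readers who want to see how these two options could be consumed, here is a minimal sketch of a parser for the `key=value` lines above. The struct `mic_config_t` and the function `parse_mic_config` are hypothetical names, and treating `#` as the start of a trailing comment is an assumption based only on the snippet; Sunshine's real configuration code in `config.cpp` may handle this differently.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the two options; defaults reflect "disabled by default".
struct mic_config_t {
  bool enable_mic_passthrough = false;
  std::string mic_sink = "default";
};

// Strip leading and trailing spaces, tabs, and carriage returns.
static std::string trim(const std::string &s) {
  const auto first = s.find_first_not_of(" \t\r");
  if (first == std::string::npos) {
    return "";
  }
  const auto last = s.find_last_not_of(" \t\r");
  return s.substr(first, last - first + 1);
}

mic_config_t parse_mic_config(const std::string &text) {
  mic_config_t cfg;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    line = line.substr(0, line.find('#'));  // assumption: '#' begins a comment
    const auto eq = line.find('=');
    if (eq == std::string::npos) {
      continue;  // not a key=value line
    }
    const std::string key = trim(line.substr(0, eq));
    const std::string value = trim(line.substr(eq + 1));
    if (key == "enable_mic_passthrough") {
      cfg.enable_mic_passthrough = (value == "true");
    } else if (key == "mic_sink" && !value.empty()) {
      cfg.mic_sink = value;
    }
  }
  return cfg;
}
```

Unknown keys are ignored, so a configuration file without either option leaves the feature off.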
Testing
The implementation has been validated for:
- ✅ Syntax and compilation compatibility
- ✅ Cross-platform code structure
- ✅ Integration with existing audio system
- ✅ Configuration parsing and validation
- ✅ Network infrastructure setup
Dependencies
No new external dependencies required. Uses existing:
- Opus codec (already present for audio streaming)
- Platform audio APIs (PulseAudio, WASAPI, AVFoundation)
- Existing network and threading infrastructure
Notes for Reviewers
This is a server-side implementation. Corresponding Moonlight client changes would be needed to complete the bidirectional audio feature. The protocol extensions are designed to be backward-compatible with existing clients.
The implementation follows existing Sunshine patterns for:
- Configuration management (`config.h`/`config.cpp`)
- Platform abstraction (`platform/common.h`)
- Network protocols (`stream.cpp`, `rtsp.cpp`)
- Audio processing (`audio.h`/`audio.cpp`)
Breaking Changes
None. This feature is entirely additive and disabled by default.
Author: [email protected]
@cardoza1991 thank you for the PR! There has been a lot of talk about this feature as of late.
Would you mind editing the PR body to use our template? You can get the original template from here: https://github.com/LizardByte/.github/blob/master/.github/pull_request_template.md?plain=1
I think the approach is generally good, but I don't think we need any changes to the control stream. I think we should do all the configuration via RTSP/SDP. The server can advertise mic support via SDP like you're doing here. If the client supports mic, then it can send an RTSP PLAY for the mic stream and that will tell Sunshine to expect microphone input.
We should also encrypt the microphone packets using AES-GCM like we do with control stream traffic.
My 2 cents regarding the protocol:
- Encryption should be (optionally) supported
- FEC should be (optionally) supported, or just sending duplicate UDP packets spread around in time I guess
- Multi-client streaming should be supported, e.g. client-identifying packet header outside of encrypted payload
- Mic packet's header+payload should be sufficiently different from ping packets (which are 20 bytes in length). The motivation is to make ping port capable of accepting mic packets too. Currently moonlight/sunshine protocol requires only 2 port numbers to operate in full capacity, and I will be extremely thankful if it stays that way.
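One way the last point could work, sketched under assumptions: if mic packets start with a magic prefix and are never 20 bytes long, a single port can tell them apart from pings. The magic value `MIC0`, the minimum size of 24 bytes, and the name `classify_packet` are all hypothetical and not part of the existing Moonlight/Sunshine protocol; only the 20-byte ping length comes from the discussion.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class packet_kind_e {
  ping,
  mic,
  unknown
};

constexpr std::size_t PING_PACKET_SIZE = 20;     // ping length stated in the discussion
constexpr std::size_t MIC_MIN_PACKET_SIZE = 24;  // hypothetical: always longer than a ping
constexpr unsigned char MIC_MAGIC[4] = {'M', 'I', 'C', '0'};  // hypothetical prefix

// Classify a datagram received on the shared port by length first, then by prefix.
packet_kind_e classify_packet(const std::uint8_t *data, std::size_t size) {
  if (size == PING_PACKET_SIZE) {
    return packet_kind_e::ping;
  }
  if (size >= MIC_MIN_PACKET_SIZE && std::memcmp(data, MIC_MAGIC, sizeof(MIC_MAGIC)) == 0) {
    return packet_kind_e::mic;
  }
  return packet_kind_e::unknown;
}
```

Checking the length before the prefix means a ping whose bytes happen to match the magic is still treated as a ping.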
> I think we should do all the configuration via RTSP/SDP. The server can advertise mic support via SDP like you're doing here. If the client supports mic, then it can send an RTSP PLAY for the mic stream and that will tell Sunshine to expect microphone input.
I believe midstream mic hotplug is a thing that should be supported (at least on the protocol level), and this can be implemented either through Control or Encrypted RTSP. But doing it through Control is probably easier.
@ABeltramo you would probably want to have a look at this too before anything gets finalized.
Thanks for the ping @ns6089 I agree with most of what has been said so far.
I think the protocol should be reversed though: a client advertises for a ~~microphone~~ generic audio input source, and we create the correct audio sink that matches the requested bitrate+channels on the host. Why would it be hardcoded and advertised from the server? This doesn't feel right https://github.com/cardoza1991/Sunshine/blob/9f8dd8d0d88d76f962daea2a8b054c7e2eed9653/src/rtsp.cpp#L759-L766 why hard-coding some values like that?
Also, I wouldn't make the mistake of assuming a single global microphone stream.
Since we have the freedom to create this from scratch, let's support multi-users and multiple audio input devices (it doesn't have to be strictly just a microphone!) right from the start. If we put an identifier in the control packet header, we don't even need multiple ports for different input streams.
Really excited for this, thanks @cardoza1991 to get the ball rolling!
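The identifier-in-header idea from the comment above could be sketched as a small demultiplexer: every packet arrives on one port and is routed by a key taken from its header. The key layout (a 32-bit client id plus an 8-bit source id) and the names `input_demux_t` and `input_stream_t` are assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical key: (client id, audio input source id within that client).
using stream_key_t = std::pair<std::uint32_t, std::uint8_t>;

// Per-stream state; a real implementation would hold a decoder and an audio sink here.
struct input_stream_t {
  std::uint64_t packets = 0;
  std::uint64_t bytes = 0;
};

class input_demux_t {
public:
  // Route one packet to its stream, creating the stream on first use.
  void route(std::uint32_t client_id, std::uint8_t source_id, std::size_t payload_size) {
    input_stream_t &stream = streams[stream_key_t {client_id, source_id}];
    ++stream.packets;
    stream.bytes += payload_size;
  }

  // Look up a stream without creating it; returns nullptr if it does not exist.
  const input_stream_t *find(std::uint32_t client_id, std::uint8_t source_id) const {
    const auto it = streams.find(stream_key_t {client_id, source_id});
    return it == streams.end() ? nullptr : &it->second;
  }

  std::size_t stream_count() const {
    return streams.size();
  }

private:
  std::map<stream_key_t, input_stream_t> streams;
};
```

With this shape, a second user or a second input device is just another key in the map rather than another port.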
Yeah, well, I figured I'd tackle a 5-year-old feature request, so here it is. Thanks for the feedback.
Data flow can probably be like this:
- During RTSP. Client announces support for generic mic pass-through and whether it wants mic encryption. Server assigns and gives client some session token (used for packet identification later on). Port number for incoming mic packets is also shared here. So is whether or not server accepted the request for encryption.
- During stream, in Control. Client announces mic creation, with a number unique to this client and some channel format.
- Client begins sending packets (to the port announced in RTSP). Each packet contains the session token (provided during RTSP), a mic number unique to this client, a packet counter for this mic, and the audio payload. The audio payload is encrypted with AES-GCM if both sides agreed on encryption during RTSP; because AES-GCM also authenticates the data, it doubles as a validator and protects against malicious packets.
- Optionally during stream, in Control. Client can announce mic destruction, for the particular mic number.
Communication in Control is kept intentionally unidirectional because it's painful to read async replies from it.
Not everything has to be implemented at the same time; encryption, for example, can easily be deferred.
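The packet described in the data flow above could be laid out roughly as follows. This is only a sketch: the field widths (32-bit token, 8-bit mic number, 32-bit counter), the big-endian byte order, and the names `mic_packet_t`, `serialize_mic_packet`, and `parse_mic_packet` are assumptions, and the AES-GCM step is left out, so the payload is treated as opaque bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire layout: token (4) | mic number (1) | counter (4) | payload.
struct mic_packet_t {
  std::uint32_t session_token;  // assigned by the server during RTSP
  std::uint8_t mic_number;      // unique per client
  std::uint32_t counter;        // per-mic packet counter
  std::vector<std::uint8_t> payload;  // audio data; ciphertext if encryption was negotiated
};

constexpr std::size_t WIRE_HEADER_SIZE = 9;

static void put_u32(std::vector<std::uint8_t> &out, std::uint32_t v) {
  out.push_back(static_cast<std::uint8_t>(v >> 24));
  out.push_back(static_cast<std::uint8_t>(v >> 16));
  out.push_back(static_cast<std::uint8_t>(v >> 8));
  out.push_back(static_cast<std::uint8_t>(v));
}

static std::uint32_t get_u32(const std::uint8_t *p) {
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
         (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

std::vector<std::uint8_t> serialize_mic_packet(const mic_packet_t &p) {
  std::vector<std::uint8_t> out;
  out.reserve(WIRE_HEADER_SIZE + p.payload.size());
  put_u32(out, p.session_token);
  out.push_back(p.mic_number);
  put_u32(out, p.counter);
  out.insert(out.end(), p.payload.begin(), p.payload.end());
  return out;
}

// Returns false for datagrams too short to hold a header plus at least one payload byte.
bool parse_mic_packet(const std::uint8_t *data, std::size_t size, mic_packet_t &out) {
  if (size <= WIRE_HEADER_SIZE) {
    return false;
  }
  out.session_token = get_u32(data);
  out.mic_number = data[4];
  out.counter = get_u32(data + 5);
  out.payload.assign(data + WIRE_HEADER_SIZE, data + size);
  return true;
}
```

Keeping the token, mic number, and counter outside the encrypted payload is what would let the server pick the right session key before attempting decryption.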
How can I contribute to this PR? Do I need to submit changes to cardoza1991's repo then have it moved here?
> Thanks for the ping @ns6089 I agree with most of what has been said so far.
> I think the protocol should be reversed though: a client advertises for a ~~microphone~~ generic audio input source, and we create the correct audio sink that matches the requested bitrate+channels on the host. Why would it be hardcoded and advertised from the server? This doesn't feel right https://github.com/cardoza1991/Sunshine/blob/9f8dd8d0d88d76f962daea2a8b054c7e2eed9653/src/rtsp.cpp#L759-L766 why hard-coding some values like that?
> Also, I wouldn't make the mistake of assuming a single global microphone stream. Since we have the freedom to create this from scratch, let's support multi-users and multiple audio input devices (it doesn't have to be strictly just a microphone!) right from the start. If we put an identifier in the control packet header, we don't even need multiple ports for different input streams.
> Really excited for this, thanks @cardoza1991 to get the ball rolling!
I threw in audio support for windows, Mac’s audio stuff and Linux. Honestly my primary focus should have just been Linux since me and the boys are streaming games with our servers.
> How can I contribute to this PR? Do I need to submit changes to cardoza1991's repo then have it moved here?
If you submit a PR to this branch (https://github.com/cardoza1991/Sunshine/tree/feature/bidirectional-microphone-passthrough) and it gets accepted and merged, then it would be included in this PR.
If the changes you want are simple, it might be better to just do a review here (https://github.com/LizardByte/Sunshine/pull/4078/files)
> I think the approach is generally good, but I don't think we need any changes to the control stream. I think we should do all the configuration via RTSP/SDP. The server can advertise mic support via SDP like you're doing here. If the client supports mic, then it can send an RTSP PLAY for the mic stream and that will tell Sunshine to expect microphone input.
> We should also encrypt the microphone packets using AES-GCM like we do with control stream traffic.
Awesome thanks for this
Quality Gate failed
Failed conditions
34 New issues
D Reliability Rating on New Code (required ≥ A)
2 New Bugs (required ≤ 0)
32 New Code Smells (required ≤ 0)
@cardoza1991 question: are you still working on this, or is this PR stale?
Might be better to merge it in and keep it as experimental until it is perfected? It has been a highly requested feature for 5 years.
> Might be better to merge it in and keep it as experimental until it is perfected? It has been a highly requested feature for 5 years.
I am asking because I am considering taking a spin at this if they are no longer working on it.
@MNarath1 it appears stale to me.
Is this topic dead again?