Rewrite audio system using libcrossaudio
Context
The audio system currently used in the client is rather messy and has quite a few quirks that can't be easily fixed. On top of that, the code is written in a style that makes it hard to grasp what is going on.
Description
For this reason, the audio handling should be rewritten from scratch, with special attention paid to fixing the audio issues users currently experience.
Mumble component
Client
OS-specific?
No
Additional information
The current issues that need to be fixed are:
- #4883
- #4177, #1092
- #5314, #1089 - see also implementation attempt #171 and the revert in #4633
- #5379
- #3491 (especially https://github.com/mumble-voip/mumble/issues/3491#issuecomment-606189582): Try to use a different Bluetooth protocol (in some cases) to allow better audio quality
- #5577
- #1293
On the code side, we should
- get rid of `goto` statements - split processing into smaller functions
- Use smart-pointers
- Use more descriptive variable names
- Don't use global variables (that much)
- Don't use magic & hardcoded numbers; prefer named constants
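As a rough illustration of the code-side points above, here is a minimal C++ sketch. Every name in it (`kSampleRateHz`, `Resampler`, `makeResamplerIfNeeded`, `classifyFrame`, and so on) is hypothetical and not taken from Mumble or libcrossaudio; the values are placeholders chosen for the example. It only shows the intended style: named constants instead of magic numbers, ownership via smart pointers, and small functions with early returns where a `goto` might otherwise be used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// NOTE: all names and values here are hypothetical, for illustration only.

// Named constants instead of magic numbers scattered through the code.
constexpr std::uint32_t kSampleRateHz    = 48000;
constexpr std::uint32_t kFrameDurationMs = 10;
constexpr std::size_t kSamplesPerFrame   = kSampleRateHz * kFrameDurationMs / 1000;

struct Resampler {
	std::uint32_t inputRateHz;
};

// Ownership is explicit: the caller gets a unique_ptr, or nullptr when
// the device already runs at the target rate and no resampler is needed.
std::unique_ptr< Resampler > makeResamplerIfNeeded(std::uint32_t deviceRateHz) {
	if (deviceRateHz == kSampleRateHz) {
		return nullptr;
	}
	return std::make_unique< Resampler >(Resampler{ deviceRateHz });
}

// Small, single-purpose helpers with descriptive names.
bool isFrameComplete(const std::vector< float > &samples) {
	return samples.size() == kSamplesPerFrame;
}

float peakLevel(const std::vector< float > &samples) {
	float peak = 0.0f;
	for (float sample : samples) {
		peak = std::max(peak, std::abs(sample));
	}
	return peak;
}

enum class FrameResult { Incomplete, Silent, Voiced };

// Early returns replace the "goto cleanup"-style control flow.
FrameResult classifyFrame(const std::vector< float > &samples, float silenceThreshold) {
	assert(silenceThreshold >= 0.0f);

	if (!isFrameComplete(samples)) {
		return FrameResult::Incomplete;
	}
	if (peakLevel(samples) < silenceThreshold) {
		return FrameResult::Silent;
	}
	return FrameResult::Voiced;
}
```

Because each helper does one thing and has no global state, each can be tested in isolation.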
Furthermore:
- Remove ability to use different backends for input and output (https://github.com/mumble-voip/mumble/issues/5408#issuecomment-1015248102)
- Consider completely shutting down audio processing while disconnected (keep TTS in mind though)
- Add support for stereo audio input (#5626)
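The first two points could look roughly like the following C++ sketch. It is only an assumption about one possible shape, not the planned design and not the libcrossaudio API: `Backend`, `AudioSettings`, and `AudioEngine` are hypothetical names. The settings hold a single backend while still allowing distinct input and output devices, and audio processing stops on disconnect unless text-to-speech output is still pending.

```cpp
#include <cassert>
#include <string>
#include <utility>

// NOTE: hypothetical types for illustration; not Mumble or libcrossaudio API.

enum class Backend { WASAPI, PulseAudio, PipeWire, JACK, CoreAudio };

struct AudioSettings {
	Backend backend;          // one backend shared by input and output
	std::string inputDevice;  // devices may still differ per direction
	std::string outputDevice;
};

class AudioEngine {
public:
	explicit AudioEngine(AudioSettings settings) : m_settings(std::move(settings)) {}

	void onConnected() {
		m_connected     = true;
		m_inputRunning  = true;
		m_outputRunning = true;
	}

	void onDisconnected() {
		m_connected    = false;
		m_inputRunning = false;
		// Keep the output alive only while TTS messages are still queued.
		m_outputRunning = m_pendingTts > 0;
	}

	void queueTts() {
		++m_pendingTts;
		m_outputRunning = true; // TTS needs output even while disconnected
	}

	void ttsFinished() {
		assert(m_pendingTts > 0);
		--m_pendingTts;
		if (m_pendingTts == 0 && !m_connected) {
			m_outputRunning = false;
		}
	}

	bool inputRunning() const { return m_inputRunning; }
	bool outputRunning() const { return m_outputRunning; }
	const AudioSettings &settings() const { return m_settings; }

private:
	AudioSettings m_settings;
	bool m_connected     = false;
	bool m_inputRunning  = false;
	bool m_outputRunning = false;
	int m_pendingTts     = 0;
};
```

With a single `backend` field, the user only has to pick the backend once, and the engine could be registered as one node in a PipeWire/JACK graph.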
Also, a rather controversial design decision is that the audio backend can be different for output and input. Is there actually any use case for this? It not only complicates the code but is also unintuitive for the user, as you have to change the backend in the settings twice (unless using the Audio Wizard). Is there anyone who actually uses different backends? Otherwise I would vote to remove this implementation detail, which would also make other things easier, such as registering Mumble as a single node (with input and output ports) in the PipeWire/JACK node graph, instead of registering two nodes with one having the input ports and the other the output ports.
I agree with @vimpostor's idea. In fact, the separation of the backend for input and output is not very compatible with macOS, especially when it comes to the native echo cancellation of Mac...
The biggest obstacle is (and has always been, for audio-related challenges) that our current audio system supports so many backends and so many OSs, which makes refactoring a daunting task in general. I think perhaps someone on the Mumble team should take the lead and lay out a general roadmap and structure, so other volunteers can easily catch up and start working on a specific backend/OS.
The backend separation was extremely useful on Linux before PipeWire became a thing: it was not rare to see JACK as input and PulseAudio as output.
Right now I can only think of Windows with WASAPI + ASIO as a reason for the feature to exist, but as far as I know it's not an ideal setup anymore now that WASAPI itself can deliver low latency (related: #3503).
One extremely useful feature for me is the ability to use two distinct audio devices via WASAPI, one for input, the other for output, each with its own bitrate/depth, and have Mumble work flawlessly with them.
Relevant: https://github.com/mumble-voip/libcrossaudio