Add PulseAudio backend for SuperCollider on Linux
Purpose and Motivation
For a discussion of why we should have a PulseAudio backend for SuperCollider on Linux, see https://github.com/supercollider/rfcs/pull/11.
I would like to keep the discussion on the why separate from the implementation itself (the how). The idea is to use this PR to discuss the implementation itself, and use the RFCs for the reasons why it makes sense. But I am not sure that we need to be very strict about this, since that's a bit of a grey area.
It is important to understand that this is not meant to be a replacement for jack at all. This is just to give an option on systems where jack is difficult to set up. jack will continue to be the default.
This PR adds support for PulseAudio, both inputs and outputs. The device selection is done outside SC, as per PulseAudio philosophy, where applications output to a default sink, and get input from a default source, and the actual connections happen on the system's setup / PulseAudio control applications.
The changes are pretty self contained, with just one C++ file added, and then some changes to CMakeLists and READMEs as required.
The selection of frontend is done at build time, and by default the backend will continue to be jack. This will only try to use the PulseAudio backend if instructed explicitly to do so, using the AUDIOAPI cmake variable. So, for users that are currently happy with their setup, using jack, this change will not affect them at all and will be completely transparent.
On systems where it is difficult to use jack, the user / maintainer can set the cmake variable to pulseaudio, and then, and only then, it will use the PulseAudio backend.
As you can see the code is not too complex, and should be easy to maintain. Note that the expectation is that the PulseAudio backend will be used in simple-ish systems, so we may not want to support complex / exotic audio configurations, and encourage people to use jack for that.
Types of changes
- Documentation
- New feature
To-do list
My understanding is that a PortAudio backend for Linux scsynth exists, but it's experimental and at your own risk. I wouldn't say it's "supported" (which implies some obligation to keep it working in a wide range of installations -- we're not making any such promise presently).
My understanding is that a PortAudio backend for Linux scsynth exists, but it's experimental and at your own risk. I wouldn't say it's "supported" (which implies some obligation to keep it working in a wide range of installations -- we're not making any such promise presently).
Ok, thanks for clarifying, I'll keep it out of the readme update, as per my original idea.
I was surprised to see the inclusion of RtAudio for this feature. From what I understand, RtAudio is an alternative to PortAudio. Maybe not for Linux, but SC already makes use of PortAudio. Why should SC include two competing audio APIs? RtAudio being used only for an experimental feature isn't justification enough in my opinion.
I find the casual inclusion of RtAudio here problematic. It needs to be mentioned clearly in the PR description. It's an important implementation detail that can easily go unnoticed at first glance.
I was surprised to see the inclusion of RtAudio for this feature. From what I understand, RtAudio is an alternative to PortAudio. Maybe not for Linux, but SC already makes use of PortAudio. Why should SC include two competing audio APIs? RtAudio being used only for an experimental feature isn't justification enough in my opinion.
The problem is that PortAudio does NOT support PulseAudio (it does support Alsa), so it doesn't meet the requirement to work with PulseAudio natively. That's why I went the RtAudio way (after trying the pure PulseAudio API), because RtAudio does have a PulseAudio backend (it also has direct Alsa interface - but that's not what we need). Have a look here to see the supported backend on PortAudio, you'll see that PulseAudio is not there: http://portaudio.com/docs/v19-doxydocs/api_overview.html.
I find the casual inclusion of RtAudio here problematic. It needs to be mentioned clearly in the PR description. It's an important implementation detail that can easily go unnoticed at first glance.
I am more than happy to mention that in the PR more prominently. I can put it in the README_LINUX file. Would that be ok? I have just added the fact that this is using it to the RFC itself.
Thank you for the feedback,
just want to mention that after we discussed this RFC on the dev call i looked into other alternatives to RtAudio.
libsoundio appears to be abandoned (latest commit 14 August 2018), so i would not prefer it over rtaudio (latest commit 7 June 2020) just based on that fact alone.
i wasn't really able to find any other library that does this.
however, i also see that RtAudio's PulseAudio support is explicitly noted as not reliable by the maintainers: https://github.com/thestk/rtaudio/issues/262
The Pulse support in RtAudio has been fairly minimal. It was contributed by a user and has never been extensively investigated by myself or others to bring it up to what I might consider "full" support. The ALSA support, on the other hand, is (or at least was) robust and includes "realtime" functionality that was tested and verified.
is the main issue with using the PulseAudio API directly the asynchronicity in some callbacks? or were there more issues @llloret ? i'd consider whether we're just causing trouble for ourselves later on here by relying on another project's experimental work, when we might be able to make patches to our backend more easily and reliably ourselves.
in any case, i agree with @patrickdupuis that new dependencies need to be clearly advertised in PRs. thank you for updating that!
@llloret sorry, I'm not very familiar with all of this.
The problem is that PortAudio does NOT support PulseAudio (it does support Alsa), so it doesn't meet the requirement to work with PulseAudio natively.
I've seen it written online that PortAudio can work with PulseAudio. Maybe that's not actually the case and ALSA is being used?
Side note, apparently, much to my own surprise, I've tested this out in the past: https://github.com/supercollider/supercollider/issues/1944#issuecomment-301535063
however, i also see that RtAudio's PulseAudio support is explicitly noted as not reliable by the maintainers: thestk/rtaudio#262
Please have a look at the discussion in https://github.com/supercollider/rfcs/pull/11. @sonoro1234 and I give some pointers that show that the PulseAudio API is being actively maintained. As a more specific example, I see that the device detection code was enhanced around 3 months ago. So, there's clearly people working on that.
I will also note that during my limited testing so far, I have not found any stability or usability issues.
is the main issue with using the PulseAudio API directly the asynchronicity in some callbacks? or were there more issues @llloret ? i'd consider whether we're just causing trouble for ourselves later on here by relying on another project's experimental work, when we might be able to make patches to our backend more easily and reliably ourselves.
I haven't seen any issues in that area to be honest. I see a mutex protecting a critical area in the callback, but no other threads would be contesting that mutex during normal use (they are used at stream start, stop and abort), so I don't see that as an issue.
I would not try to look too far on this, and this is the reason: today, the RtAudio API looks like the best approach to implement a PulseAudio backend. It does everything that we need to, and it seems to be maintained well enough. If in the future, we see that we hit a limitation, then nothing stops us from implementing our own raw PulseAudio API. But until that moment, I'd rather rely on external work that seems to have gone a decent amount of testing, that seems to be in use in other projects around the world, and that comes from a reputable institution (I know all these things do not guarantee the work quality - but it gives a good level of confidence). This RtAudio package is available in almost every Linux distribution, which gives some confidence too. So while I appreciate that at some point a maintainer said that this was considered to be experimental, I believe that this is mature to be used here. But, as usual, let's make sure that this is the case through testing. Otherwise, we are just doing premature pessimization, which is not good.
I would also like to point out, that TODAY we have a PulseAudio backend implemented using RtAudio, that has already gone through some level of testing to prove that it works in this Pull Request. An alternative PulseAudio backend NOT using RtAudio does not exist at the moment, although I did som PoC work on that (and the code did not look as readable / maintainable).
I encourage people to give this PR a try and give feedback on their findings.
Of course, all this needs more testing, but as of today, I have not seen any issues that make me think that RtAudio is the wrong approach.
So while I appreciate that at some point a maintainer said that this was considered to be experimental, I believe that this is mature to be used here
to be honest, i don't see much that supports the notion this has progressed to a 'mature' state since 2018, but i do see that they would quickly and gladly accept PRs, which is enough for me. i am OK using RtAudio here, as i said in the RFC. ^^
I would also like to point out, that TODAY we have a PulseAudio backend implemented using RtAudio, that has already gone through some level of testing to prove that it works in this Pull Request. An alternative PulseAudio backend NOT using RtAudio does not exist at the moment, although I did som PoC work on that (and the code did not look as readable / maintainable).
saying that we should use your implementation just because it is the only implementation, is bad form as a PR author in my opinion. you don't gain anything by forcing others to choose between supporting an implementation they don't like and missing out on a feature entirely.
do you have the raw PulseAudio API PoC in a branch anywhere? i would be interested to see it.
also, you keep mentioning testing; can you please provide more information on what exactly you tested? did you compare this implementation's latency limits with this backend compare to the jack backend? that's the main thing to know IIUC. i could also compare that on my Arch Linux machine and possibly my RPi 4.
saying that we should use your implementation just because it is the only implementation, is bad form as a PR author in my opinion.
I am not saying that we should use my implementation, sorry if it came accross that way. What I meant is that the current implementation is something that we have today and we can start testing today, and try to find flaws in it today. If after trying it, people do not like it, fair enough. I am not saying that my implementaiton is better, just that my implementation exists...
also, you keep mentioning testing; can you please provide more information on what exactly you tested?
I have tested that this works with acceptable (subjective) latency on an old laptop with Ubuntu 18.04, and Linux Mint 19.3, and Linux Mint 20, and a Raspberry Pi 4. I have tested that both inputs and outputs work. I'm confident that this is enough the vast majority of the use case that we are targetting PulseAudio for - but of course, this needs further testing. I do not want people to take my word for it, this is gighly subjective.
did you compare this implementation's latency limits with this backend compare to the jack backend?
I did not test this explicitly, because as I have mentioned before, the purpose of this work is not to replace jack. From what I have seen jack will always have better latency, because that's what it was designed for. PulseAudio was designed for ease of use and integration of applications, while maintaining good enough latency for most users.
do you have the raw PulseAudio API PoC in a branch anywhere? i would be interested to see it.
The work here did not finish, since I left it after implementing the output, and tried RtAudio, but I am happy to share it on my repo. I'll upload it later. You'll see that the code is not as straightforward to understand. Also, please keep in mind that it is unfinished work and is not up to my standard quality ;)
One other suggestion: It would be nice to implement ServerOptions *devices and allow device selection in the scsynth command line. I did see the suggestion that users should do it in the PulseAudio control panel, but if it's not too much trouble, it would enhance compatibility with macOS and Windows. (If it is too much trouble, then that could be postponed or dropped.)
One other suggestion: It would be nice to implement
ServerOptions *devicesand allow device selection in the scsynth command line. I did see the suggestion that users should do it in the PulseAudio control panel, but if it's not too much trouble, it would enhance compatibility with macOS and Windows. (If it is too much trouble, then that could be postponed or dropped.)
I totally agree with this, and brings me to an important discussion point: the thing is that the version of RtAudio that comes prepackaged in Linux Mint 19 and 20 (and I guess in a lot of other distributions) is version 5.0, which does not include the latest changes that were made to RtAudio around 3 months ago to allow for proper device selection. The packaged version just selects the default device and assigns 2 input and 2 outputs channels. The lates RtAudio available version, 5.1, does allow selection of audio devices (but I have not tried that version yet).
So, we have 2 choices here: either rely on the "official" packaged version on each linux distribution OR package our own RtAudio (perhaps as a submodule, as we are doing with things like Link?). I believe the 2nd is probably best, since that way we can choose what version of RtAudio to use, and move to more recent versions quicker. It would also mean that everyone would be using the same version which is a plus for figuring out what's going on when things do not work as expected.
What are people's thoughts?
I believe the 2nd is probably best
So do I
i dont see that one feature as a strong enough motivation for packaging this outselves. strong no from me.
i dont see that one feature as a strong enough motivation for packaging this outselves. strong no from me.
This means that we might get different behaviour depending on the rtaudio library that each distribution packages. Keep in mind that the rtaudio library is just one cpp file and a header file, so having this as dependency on our own repo as a submodule would be close to zero work. And then we have the flexibility to package the rtaudio version that we want... For example, someone requested the possibility to select the audio device on SC itself, which is something that cannot be done with the rtaudio version that is being packaged by most maintainers today, but the last rtaudio version already has that capability.
What are you concerned about packaging this ourselves? It's just having the submodule, and pointing to the proper file on the CMake files, which is straightforward. I might be missing something, though...
My concern by not using our own RtAudio version, is that we may need to potentially test in a high number of distributions to make sure that the RtAudio packages are consistent. Otherwise, you might get different behaviours depending on what rtaudio is packaged. For a more complex library, it would be worth doing that, but RtAudio is very straightforward to include, as I said above.
Use it for Windows as well, then PA can be swapped out in exchange. And I think rtAudio ships with ASIO files included, that would take out another dependency that is actually quite hairy and has a hidden license issue ;-) Who knows, the boot problems and jack incompatibility on Windows might go away as well...
Use it for Windows as well, then PA can be swapped out in exchange.
But not in this PR (could be in the future)
Tested in ubuntu studio 19.10. Works with jack started by routing pulseaudio and with jack stopped letting select audio card with system audio mixer.
As @jamshark70 said only CPU load is missing.
It could be taken from SC_CoreAudio.cpp:2125 but using getTime from SC_Time.hpp:36 (std::chrono::system_clock::now();)
Another thing missing (may be another PR) would be pulseaudio for supernova (I guess it can be done from portaudio_backend.hpp)
What are you concerned about packaging this ourselves? It's just having the submodule, and pointing to the proper file on the CMake files, which is straightforward. I might be missing something, though...
i've responded on the RFC. quick reminder to everyone here to keep the discussion focused and on-topic please:
The idea is to use this PR to discuss the implementation itself, and use the RFCs for the reasons why it makes sense.
CPU load tested: Average CPU similar to Jack. PeakCpu goes more than 100% sometimes (but without sound glitches)
I've tested this on Ubuntu 20.04. Server boots and makes sound 👍
I've found the following issue: if I force stop a synth (with ctrl-.), the last audio buffer seems to keep playing in a loop; this doesn't happen with JACK backend. Not sure how that relates to the backend code...
I see that it's rtAudio's fault that it only reports a single device currently with only 2 ins and 2 outs, even if connected soundcard has more IO. IMO the device selection should still be part of our implementation (both for server device switch, as well as language's ServerOptions.devices), so it'd be useable on systems with updated rtAudio library. But we can also add that functionality later I guess.
(sorry to interrupt)
if I force stop a synth (with ctrl-.), the last audio buffer seems to keep playing in a loop; this doesn't happen with JACK backend.
I will add this here, this happens with jack too. ctrl+. doesn't stop the audio instantly, the last buffer keeps playing.
I also have ubuntu based distro installed
this happens with jack too
That's quite strange -- I've never seen that (10 years using SC in Linux). Maybe log that as a separate issue. I'd have to suspect something in the system configuration but I can't guess what it is.
turns out it is due to FoxDot, it uses a buffer to send the commands to sc
thanks @tshrpl @jamshark70
Can anyone else confirm an issue (in this PR and PulseAudio) with the last audio buffer playing after stopping the synths with cmd+.? Or is this only happening on my system?
thanks @tshrpl @jamshark70 Can anyone else confirm an issue (in this PR and PulseAudio) with the last audio buffer playing after stopping the synths with
cmd+.? Or is this only happening on my system?
Interesting, I think I can reproduce it on my test system. I'll have a look. Thanks a lot for reporting it!
Can anyone else confirm an issue (in this PR and PulseAudio) with the last audio buffer playing after stopping the synths with cmd+.?
Confirmed -- actually it's worse than that. All three of these fail with PA, but fine with JACK:
a = { SinOsc.ar(440, 0, 0.1).dup }.play;
CmdPeriod.run;
a = { SinOsc.ar(440, 0, 0.1).dup }.play;
a.free;
a = { SinOsc.ar(440, 0, 0.1).dup }.play;
a.release;
So, in my build of it (just pulled), there is no way to stop any synth safely.
~~There's if (tch[k] == bufCounter) (line 172) so I guess tch[k] is still being updated even after the Out ugen is destroyed?~~
Also, server starts with 48 kHz sample rate, but the soundcard is running at 44.1 kHz...?
Ohhh... I see it now. (What's a mystery to me is -- it should have failed before when I was testing, but it didn't.)
From line 172:
If the bus has been touched, copy the full control block of audio using a for...
if (tch[k] == bufCounter) {
const float* src = outBuses + k * bufFrames;
for (int n = 0; n < bufFrames; n++) {
*dst = *src;
src++;
dst += m_outputChannelCount;
}
But if it wasn't touched, zero out only a single sample -- here's the bug -- this should be a for as well.
} else {
*dst = 0;
dst += m_outputChannelCount;
}
}
Also, server starts with 48 kHz sample rate, but the soundcard is running at 44.1 kHz...?
This might be another issue that was fixed in rtAudio 5.1, see here. As it was mentioned earlier, before that commit rtAudio only reported a single default device with 2 ins/outs, and also seems like hardcoded 48k default samplerate.
Also, server starts with 48 kHz sample rate, but the soundcard is running at 44.1 kHz...?
This might be another issue that was fixed in rtAudio 5.1, see here. As it was mentioned earlier, before that commit rtAudio only reported a single default device with 2 ins/outs, and also seems like hardcoded 48k default samplerate.
Yes, that's right. If the user does not select a preferred samplerate, the current rtaudio version will always select 48kHz. The most recent version will select the one advertised by the card.
Sorry, pressed the wrong button, and closed the PR... I have reopened it again. The Close and comment button should not be that close to the comment one ;).
@llloret just a side note - could you please fix the formatting on this PR?
Sure, yes, I did that on the first commits, but forgot on the latest ones. I'll run the formatter later.
Just a general question: who is supposed to mark the conversations as resolved? Should I do it when I implement the fixes or answer the questions, or should the person that raised the comment do it? Different teams have different workflows for this, so I prefer to ask before marking anything as resolved.
Just a general question: who is supposed to mark the conversations as resolved? Should I do it when I implement the fixes or answer the questions, or should the person that raised the comment do it? Different teams have different workflows for this, so I prefer to ask before marking anything as resolved.
~~I think this is a bit different between an RFC and a PR. For a PR typically at least one reviewer "Approves" the PR and another then merges. Sometimes reviewers leave a comment that they'd like additionally someone else to also look over/approve the changes before being merged.~~
EDIT Sorry I think I misread the question. I don't think we have strict rules for resolving the conversation, I typically let the reviewer to mark it as resolved once they feel like it, but I think I've also seen PRs being approved without marking conversations as resolved (or marking them as such at the time of approving the PR), and occasionally I've seen the contributor mark the conversation as resolved if things were clearly addressed.
EDIT Sorry I think I misread the question. I don't think we have strict rules for resolving the conversation, I typically let the reviewer to mark it as resolved once they feel like it, but I think I've also seen PRs being approved without marking conversations as resolved (or marking them as such at the time of approving the PR), and occasionally I've seen the contributor mark the conversation as resolved if things were clearly addressed.
Ok, I think I'll apply common sense, and close the straightforward ones myself, like for example, the one about the formatting, and let the reviewer close the more complex ones.
For reference, Raspberry Pi OS is moving to pulse audio so SuperCollider (and by extension Sonic Pi) will no longer work out of the box
https://www.raspberrypi.org/blog/new-raspberry-pi-os-release-december-2020/
@samaaron not sure I understand. Didn't SuperCollider require JACK on RPi anyway? AFAIU nowadays JACK is able to coexist with PulseAudio.
Oh, interesting - I'm yet to update my Pi. I just assumed it would be the case. Perhaps it'll just be a case of hand holding jack to talk to pulse audio with the right flags?
also worth noting, portaudio may be adding pulseaudio support soon -- https://github.com/PortAudio/portaudio/pull/336
@llloret i haven't seen an update from you in a while. are you still planning to work on this? i'm happy to help and i'm sure others would also be willing if you still want to.
also worth noting, portaudio may be adding pulseaudio support soon -- PortAudio/portaudio#336
I also hope that I have time to work it soon in. If someone is interested on topic testing and reviewing code would be great thing to-do at this point so it would be 100% bullet proof when it gets in
Hi,
Yes, still planning to keep working on this. Pretty busy at work lately, but that's the idea. Luis
@llloret - with changes that are likely coming to PortAudio, this may no longer be needed. I'll add @Spacechild1 as a reviewer since he is more up to date on PortAudio work. We may not need this (apologies for the delay in general) See: https://github.com/PortAudio/portaudio/pull/336
@llloret - with changes that are likely coming to PortAudio, this may no longer be needed. I'll add @Spacechild1 as a reviewer since he is more up to date on PortAudio work. We may not need this (apologies for the delay in general) See: PortAudio/portaudio#336
That would be great. It would simplify the PulseAudio use model compared to adding a full backend (and all its associated baggage), so it would be fantastic indeed.
I will leave this open for now, and once we can do this through PortAudio, I will close it.