rfcs Add PulseAudio backend to SuperCollider

Summary: On some Linux systems Jack is either not available or difficult to configure. Users and distribution maintainers should have the option to use a PulseAudio backend instead. This RFC is about adding a new PulseAudio backend to SuperCollider. Jack will remain the default choice since it has technical advantages.

Jul 03 '20 21:07 llloret

At some point I was interested in the option to use SC on Linux without JACK. I haven't tried but I was wondering whether building with PortAudio would allow to achieve that? (I'm not primarily a Linux user and I haven't followed any possible PortAudio - PulseAudio incompatibilities).

Jul 03 '20 22:07 dyfer

I haven't tried but I was wondering whether building with PortAudio would allow to achieve that?

If I understand correctly, the currently experimental PortAudio backend would connect directly to ALSA. For the casual user (e.g. somebody who is interested in live coding but who is not interested in esoteric Linux configuration, who installed TidalCycles or FoxDot, and then found out that they have to do extra stuff, potentially complex stuff, to get any sound at all), ALSA configuration is not simpler than JACK configuration. So PulseAudio (PA) does (potentially) have the benefit of a shorter path between installation and making noise.

I think a PA backend would also require the server devices primitive in sclang. Currently devices is not implemented in Linux because it's irrelevant for JACK. But PA does maintain a list of input and output devices, so a PA-enabled SC build should maintain cross-platform compatibility in this area.

The factors as I see them are the maintenance burden (which I think will be higher than the estimate in this RFC but probably not terribly high) vs the support cost of the bug reports filed by Tidal/FoxDot/Sonic Pi users who can't get sound right away. The latter would be very very nice to resolve and good public relations for the SC community, so I'd lean in favor of having the option of a SC/PA build that might be easier to bundle with various live coding environments.

Jul 03 '20 23:07 jamshark70

If I understand correctly, the currently experimental PortAudio backend would connect directly to ALSA.

As I've dug a little into this, it seems PortAudio -> PulseAudio is possible through the ALSA compatibility layer (albeit not an ideal case), based on this report.

I think a PA backend would also require the server devices primitive in sclang. Currently devices is not implemented in Linux because it's irrelevant for JACK.

I think this should already be implemented: I added PortAudio device listing for use on Windows (since 3.11). This code is enabled based on the backend choice, not on the platform itself, so if the PortAudio backend is chosen, sclang should also use PortAudio for listing devices.
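To illustrate the idea of keying device listing on the backend rather than on the platform, here is a minimal sketch. It is not SuperCollider's actual code: the backend names, the `list_devices` function, and the placeholder device names are all hypothetical.

```python
# Hypothetical sketch: the device lister is picked from the audio backend
# chosen at build time, never from the operating system.

def list_portaudio_devices():
    # A real build would query PortAudio here; this is placeholder data.
    return ["PortAudio device 0", "PortAudio device 1"]

def list_jack_devices():
    # Per the discussion above, device listing is irrelevant for JACK.
    return []

# Assumed backend names, for illustration only.
DEVICE_LISTERS = {
    "portaudio": list_portaudio_devices,
    "jack": list_jack_devices,
}

def list_devices(backend):
    """Return the device list for the given backend name."""
    try:
        lister = DEVICE_LISTERS[backend]
    except KeyError:
        raise ValueError(f"unknown audio backend: {backend!r}")
    return lister()
```

With this shape, a Linux build that selects the PortAudio backend would get device listing without any platform check.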

Looking up "portaudio pulseaudio" in a search engine turns up https://github.com/bkgood/portaudio-pulseaudio but this repo seems outdated. I wonder if there are more recent attempts to include a PulseAudio host API in PortAudio?

Jul 03 '20 23:07 dyfer

The analysis of the problem isn't very good, I think. Tell me on which Linux distribution JACK is not available, please.

Configuration isn't that hard either, given good resources.

Ardour uses ALSA as default now, but it has its focus on plugins, not external programs. PulseAudio is maybe an option now too, for cheap (bluetooth?) devices? But afaik that's not for serious work of the kind you would expect with SuperCollider. But at least you can ask the Ardour team for advice.

PipeWire seems to want to bridge the different audio systems on Linux: https://pipewire.org/

I'm a happy JACK user though.

Jul 04 '20 08:07 grammoboy2

Having portaudio on Linux or Jack on windows is mostly done, only a few tweaks are needed for that.

Another path to consider is to add an RtAudio backend which works on Linux, Windows and Macintosh OS X. https://github.com/thestk/rtaudio Description: A set of C++ classes that provide a common API for realtime audio input/output across Linux (native ALSA, JACK, PulseAudio and OSS), Macintosh OS X (CoreAudio and JACK), and Windows (DirectSound, ASIO, and WASAPI) operating systems.

Jul 04 '20 09:07 sonoro1234

Thank you for the feedback. Just some comments below.

Another path to consider is to add an RtAudio backend which works on Linux, Windows and Macintosh OS X. https://github.com/thestk/rtaudio

I had considered going the rtaudio route, but I thought a direct native approach would have better performance. For example, the rtaudio pulseaudio backend is using the "simple" API on pulseaudio, not the full async one.

Having portaudio on Linux or Jack on windows is mostly done, only a few tweaks are needed for that.

Can you elaborate on this? I mean, I have a basic proof of concept of PulseAudio working now, and it was not rocket science, but it was not trivial either, in the sense that I had to research the PulseAudio API. Was there a simpler way that I missed?

Jul 04 '20 10:07 llloret

@jamshark70 , thank you for your feedback. Yes, I take it there might be some more work to do on other fronts as well, but in general this seems quite well contained. I am happy to make the necessary changes in sclang or other places, with a bit of guidance from someone that knows those areas.

I think you hit the nail on the head with how much this would help other communities that are using SuperCollider as the backend. These people are normally not SC experts, nor audio experts, nor Linux experts, so configuring things that do not work out of the box is a big barrier to entry for them. That's why I believe such a PulseAudio backend will help spread SC in those places. And according to the basic proof of concept I have, supporting a basic configuration would not be that difficult.

Jul 04 '20 10:07 llloret

@grammoboy2 , thanks for your comment.

The thing is that there are Linux distributions where using Jack involves making changes to the system, it does not work out of the box. Now, for someone tech-savvy, those things probably amount to nothing, but for the majority of people using those systems, it can be something next to impossible to do.

The idea is not to remove Jack and use PulseAudio; this is about giving the chance to use PulseAudio on the systems where Jack is hard to configure and use. So for people like you, who are happy with Jack, there will be no change at all. To reiterate, this is just to lower the barrier to entry to SuperCollider (and SW that uses SC as a backend), on systems where Jack is hard.

Jul 04 '20 10:07 llloret

Just an example: On Ubuntu, the default audio backend is PulseAudio, and while you can use Jack, from what I know, configuring it is not straightforward for someone who is not tech-oriented. Keep in mind that a lot of people using Ubuntu and related systems are not experts, they are just everyday users who want to get some work (or music ;)) done. If we could give them a user experience where they could just start using SuperCollider, Tidal, Sonic Pi, whatever, without having to go hunting around forums for how to use Jack there, I believe SuperCollider adoption would be considerably higher.

Jul 04 '20 10:07 llloret

I had considered going the rtaudio route, but I thought a direct native approach would have better performance. For example, the rtaudio pulseaudio backend is using the "simple" API on pulseaudio, not the full async one.

The best thing about RtAudio is that it is a very active and responsive repo: see https://github.com/thestk/rtaudio/pull/213 and other issues and additions. And given that portaudio is not a very active project, it could break in the near future, so RtAudio would be a good replacement that unifies the three major OSes.

Perhaps a longer but very productive path (for all the opensource community) would be: apply your async PulseAudio knowledge to extend RtAudio and then make an RtAudio backend to SC?

Can you elaborate on this? I mean, I have a basic proof of concept of PulseAudio working now, and it was not rocket science, but it was not trivial either, in the sense that I had to research the PulseAudio API. Was there a simpler way that I missed?

I can only elaborate for Jack on Windows: https://github.com/supercollider/supercollider/pull/3692 As for portaudio on Linux, I only meant portaudio itself on Linux, maybe not PulseAudio via portaudio.

Jul 04 '20 10:07 sonoro1234

@sonoro1234, I think what I'll do is finish my simple proof of concept using pure pulseaudio and then try using it through rtaudio, and then see which one looks better

Jul 05 '20 12:07 llloret

Hi, so I think I have got this in a mature enough state in my branch, and the code is ready for a Pull Request. I have asked around in the dev channel, but mention it here too. Is it ok to do the proper Pull request in the SuperCollider repository to start having a proper look at it? @sonoro1234 , you might be interested to know that in the end I went the rtaudio route, because I liked the API so much more, and it is much more maintainable than the raw pulseaudio C based API.

Jul 09 '20 18:07 llloret

I think making SuperCollider more widely available is a good thing, but I worry that PulseAudio users could mistake the poor performance of PulseAudio for SuperCollider problems. Once you get beyond the simple kinds of livecoding use cases, users will likely run into timing, latency and dropout issues. And while you could do a live gig using PulseAudio, it's probably not the greatest idea.

Maybe some documentation could be added, both in the downloads/install section of the website and in the SuperCollider documentation, that makes it clear that if users are experiencing performance issues and they're using the PulseAudio version, then they should switch to the JACK version.

Jul 09 '20 19:07 cianoc

you might be interested to know that in the end I went the rtaudio route, because I liked the API so much more, and it is much more maintainable than the raw pulseaudio C based API.

That's great! So if I understand correctly, SC now has an RtAudio interface (not only for PulseAudio). Did you implement the async PulseAudio API for RtAudio or just use the already existing one?

Jul 10 '20 08:07 sonoro1234

I just used the already existing one. The limitations of the RtAudio implementation will not affect the usage on SuperCollider, so I am happy about that.

Just a clarification to your comment: the RtAudio instantiation is limited to PulseAudio in the implementation (I mean that it is instantiating a PulseAudio backend). For now, I believe that's the best way, to make sure that the changes are self-contained and do not affect other architectures. But yes, in the future it should be very simple to use this same RtAudio implementation for Windows and Mac, and it will use the native backends in there. That is not mandatory, though, and if people are happy with the current implementations in those architectures, then there is no need to do anything, and the PulseAudio feature will not interfere at all with them.
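As a rough illustration of the difference between pinning one host API and letting each platform use a native one, here is a small sketch. The per-platform lists come from the RtAudio description quoted earlier in this thread; the `choose_api` function, its parameters, and the "first entry is the default" rule are assumptions made for the example, not how RtAudio or the PR actually selects an API.

```python
# Host APIs per platform, as listed in the RtAudio description quoted above.
HOST_APIS = {
    "linux": ["ALSA", "JACK", "PulseAudio", "OSS"],
    "macos": ["CoreAudio", "JACK"],
    "windows": ["DirectSound", "ASIO", "WASAPI"],
}

def choose_api(platform, pinned=None):
    """Return the host API to instantiate on the given platform.

    pinned: force one specific API (the self-contained approach described
    above, where only PulseAudio is ever instantiated).
    Without pinned, fall back to the first listed API; that ordering is an
    assumption of this sketch.
    """
    available = HOST_APIS[platform]
    if pinned is not None:
        if pinned not in available:
            raise ValueError(f"{pinned} is not available on {platform}")
        return pinned
    return available[0]

# Current scope of the RFC: always PulseAudio, only on Linux.
api_for_this_rfc = choose_api("linux", pinned="PulseAudio")
```

Pinning keeps the change self-contained; dropping the `pinned` argument is the hypothetical future step for the other platforms.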

Jul 10 '20 08:07 llloret

But yes, in the future it should be very simple to use this same RtAudio implementation for Windows and Mac

Or maybe also on Linux with OSS, ALSA or Jack.

Jul 10 '20 08:07 sonoro1234

Yes, perhaps. But for now, I prefer not to touch the jack backend at all, since that will require much more thought and consensus within the group.

I want to keep this specific RFC very focused on providing an alternative pulseaudio backend on systems where jack is hard to set up. Not more, not less.

That it might perhaps be used in the future to consolidate work (and ease maintenance), is a bonus, but I don't want to center the discussion around that.

Jul 10 '20 09:07 llloret

That it might perhaps be used in the future to consolidate work (and ease maintenance), is a bonus, but I don't want to center the discussion around that.

Yes, of course, in the future

Jul 10 '20 19:07 sonoro1234

I support making PulseAudio usable for people who can't install and configure JACK on their system. Assuming of course that performance is reasonably acceptable, that we clearly communicate PulseAudio as an experimental feature, and that we do our utmost to help users make use of JACK by way of documentation and community support.

Jul 12 '20 20:07 patrickdupuis

Hey @llloret , just read the RFC and this discussion, i'm definitely in support of this. I have a few questions and concerns for you and for some other people in this thread.

First, i think this should be documented in the RFC text: on which systems can Jack not be obtained, and on which systems is it difficult to configure (and why?). you've answered that here but it's a big question mark for me when i first read the RFC.

I have concerns about using rtaudio for this, after seeing the comment in this issue:

The Pulse support in RtAudio has been fairly minimal. It was contributed by a user and has never been extensively investigated by myself or others to bring it up to what I might consider "full" support. The ALSA support, on the other hand, is (or at least was) robust and includes "realtime" functionality that was tested and verified.

The PulseAudio backend uses mutexes in the audio generation callback, which I've heard is unacceptable in real-time audio. It would probably take days for me to understand RtAudio, understand why the mutexes are needed, and remove them. So I'll focus on the latency for now. OpenMPT's PulseAudio support doesn't use mutexes, but tends to lock up the entire program.

even if rtaudio is responsive, i'm not sure i trust their implementation at the moment. i think i'd recommend to use PulseAudio's API directly. to quote from my comment on the PR:

libsoundio appears to be abandoned (latest commit 14 August 2018), so i would not prefer it over rtaudio (latest commit 7 June 2020) just based on that fact alone.

i wasn't really able to find any other library that [wraps pulseaudio].

is the main issue with using the PulseAudio API directly the asynchronicity in some callbacks? or were there more issues @llloret ? i'd consider whether we're just causing trouble for ourselves later on here by relying on another project's experimental work, when we might be able to make patches to our backend more easily and reliably ourselves.

Jul 12 '20 23:07 mossheim

@brianlheim, thanks for the feedback. Let me try to address your concerns. In the testing I have done so far, I have not found any issues regarding stability, dropouts, etc. Yes, there is a mutex in the audio callback, but the only contention points are in calls related to starting, stopping or aborting the stream, which is something that we are not doing while streaming, so I am not too worried about that. The audio callback will be able to acquire the mutex instantly every time it needs to. If we went the direct PulseAudio way, it is very possible we would have to implement the same mutexes to make sure that it does not break while you are starting or stopping the stream.
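The contention argument can be sketched with a toy model. This is not RtAudio's code: the `Stream` class and its methods are hypothetical, and the callback uses a non-blocking acquire only so that contention becomes observable instead of causing a wait. Python also says nothing about real-time behaviour; the sketch only shows when the lock can be contended.

```python
import threading

class Stream:
    """Toy stream: one lock shared by control calls and the audio callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = False
        self.blocks_rendered = 0

    def start(self):
        # Control calls hold the lock only while they change stream state.
        with self._lock:
            self.running = True

    def stop(self):
        with self._lock:
            self.running = False

    def audio_callback(self):
        # Try to take the lock without waiting; report whether it was free.
        if not self._lock.acquire(blocking=False):
            return False  # a control call holds the lock right now
        try:
            if self.running:
                self.blocks_rendered += 1  # stand-in for rendering a block
            return True
        finally:
            self._lock.release()

stream = Stream()
stream.start()
# While streaming, nobody calls start/stop, so every acquire succeeds.
uncontended = all(stream.audio_callback() for _ in range(1000))
```

In this model the callback only finds the lock taken if a start, stop or abort is in progress at that exact moment, which matches the claim above.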

On matters of performance, I tend to believe that the proof is in trying and testing. I don't think that mutex there will hurt at all.

Also, my initial concern about RtAudio using the "simple" pulseaudio API turned out to be unfounded. That API is enough to deal with one stream (multi-channel in and out) of audio per application, which is all that we need. I was confused by the meaning of one stream, thinking it was just in OR out, but it turns out it can do both.

What I would suggest is people try the current implementation and see how well it is working for them.

Jul 13 '20 08:07 llloret

I support making PulseAudio usable for people who can't install and configure JACK on their system. Assuming of course that performance is reasonably acceptable, that we clearly communicate PulseAudio as an experimental feature, and that we do our utmost to help users make use of JACK by way of documentation and community support.

That's brilliant. Really glad to hear that.

Jul 13 '20 10:07 llloret

Do we need to promise reasonably low latency? If the use case is to make it easy to get started, that could go along with higher latency, and document that low latency use cases should use JACK. I guess some user sometime might complain that PA is too slow, but the answer is still: when you outgrow PA, time to move to JACK. ("I want to use SC as a live guitar rig and I don't want to set up JACK" -- we could reasonably dismiss that case.)
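For a rough sense of scale, the latency contributed by buffering alone is the number of buffered frames divided by the sample rate. The helper below is a sketch with a hypothetical name, and the buffer sizes are illustrative values, not measurements of PulseAudio or JACK.

```python
def buffer_latency_ms(frames_per_buffer, sample_rate_hz, num_buffers=1):
    """Latency in milliseconds added by num_buffers buffers of audio."""
    return 1000.0 * frames_per_buffer * num_buffers / sample_rate_hz

# Illustrative settings only (assumed, not measured):
small = buffer_latency_ms(64, 48000)        # about 1.33 ms
large = buffer_latency_ms(1024, 48000)      # about 21.33 ms
double = buffer_latency_ms(1024, 44100, 2)  # about 46.44 ms
```

A few tens of milliseconds is usually fine for getting started with live coding, while a live guitar rig is the kind of case that needs the small-buffer end of this range.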

IMO if a PA backend is stable, that's enough. Performance is secondary here.

Brian's got a good point that, if RtAudio's PA module isn't well maintained, it could bite us later.

Jul 13 '20 12:07 jamshark70

Brian's got a good point that, if RtAudio's PA module isn't well maintained, it could bite us later.

RtAudio is maintained and developers are quite involved. See https://github.com/thestk/rtaudio/issues/240 for example.

If the alternative is to develop our own PulseAudio backend, I think this could bite us later too. If we take the RtAudio route, anything to be fixed in PulseAudio-RtAudio could be fixed in the RtAudio repo, either by RtAudio developers or by us, avoiding duplicate efforts (leaving aside the future advantage of having a PortAudio substitute for free).

Besides, the work is already done: https://github.com/llloret/supercollider/tree/pulseaudio-backend

Jul 13 '20 13:07 sonoro1234

RtAudio is maintained and developers are quite involved. See thestk/rtaudio#240 for example.

The issue is not RtAudio's maintenance; it's the PulseAudio module's level of maintenance. "The Pulse support in RtAudio has been fairly minimal. It was contributed by a user and has never been extensively investigated by myself or others to bring it up to what I might consider 'full' support."

Jul 13 '20 13:07 jamshark70

The issue is not RtAudio's maintenance; it's the PulseAudio module's level of maintenance

thestk/rtaudio#240 was an example of PulseAudio module maintenance.

Jul 13 '20 14:07 sonoro1234

libsoundio appears to be abandoned (latest commit 14 August 2018), so i would not prefer it over rtaudio (latest commit 7 June 2020) just based on that fact alone.

i wasn't really able to find any other library that [wraps pulseaudio].

Could be libao, but I don't have experience with it (unlike RtAudio, which I have used for a long time through Lua and LuaJIT bindings).

Jul 13 '20 14:07 sonoro1234

Do we need to promise reasonably low latency? If the use case is to make it easy to get started, that could go along with higher latency, and document that low latency use cases should use JACK. I guess some user sometime might complain that PA is too slow, but the answer is still: when you outgrow PA, time to move to JACK. ("I want to use SC as a live guitar rig and I don't want to set up JACK" -- we could reasonably dismiss that case.)

IMO if a PA backend is stable, that's enough. Performance is secondary here.

Exactly my thoughts. This scenario is more about usability than performance (although performance is important too). Users who need the absolute lowest latency should research how to achieve it, and will probably be happy to do that. This is about allowing simple use of SuperCollider for users who are not prepared to do that, on systems where it is complex. For advanced use cases, like non-standard card setups, low latency, and so on, people will be encouraged to use JACK.

Brian's got a good point that, if RtAudio's PA module isn't well maintained, it could bite us later.

Of course, there might be issues in the future, but the RtAudio code dealing with PulseAudio seems quite simple and maintainable. And so far, I have not seen anything that makes me think that it might be an issue in the short / medium term. The testing I have done so far looks good, and I haven't seen any stability issues. But I would like to see what other people are seeing in this respect. The PR should work, so I encourage people to give it a try and let me know how it goes.

Jul 13 '20 14:07 llloret

thestk/rtaudio#240 was an example of PulseAudio module maintenance.

Ah, so now I've been caught not actually clicking through the link... fair enough.

Jul 13 '20 14:07 jamshark70

The issue is not RtAudio's maintenance; it's the PulseAudio module's level of maintenance.

Apart from what @sonoro1234 already said, the PulseAudio part seems to be undergoing active maintenance. For example, the device selection code has been improved recently: in the repo I can see that there is already some proper device selection code, whereas the older version only has a PulseAudio default device (this explains why in my testing I can only see one sort of "PulseAudio" device). As I explain below, I do not think this device selection is an issue.

I strongly believe that RtAudio is a good choice to support PulseAudio in SuperCollider, and from what I have seen so far it meets our requirements (i.e. enough performance, being easy to configure, and simple code). Perhaps the initial implementation with the current version will lack some things like device selection from within SC itself (it can be done in the PulseAudio server GUIs; I've already proved that in our initial testing). But on this, I still think that on PulseAudio, the user would normally configure it using the PulseAudio server itself and not SC, since that is the PulseAudio philosophy: have applications output to a default, and configure it at the system level. BTW, this allows the user to change the input and output devices on the fly, without having to restart the SC server.

It also has the added advantage (as @sonoro1234 was mentioning too) that this way we will not need to maintain code against the PulseAudio API ourselves, but can rely on the enhancements made by RtAudio. BTW the RtAudio API is much simpler to use than the raw PulseAudio API; I started the PoC using the raw PulseAudio API, and it was not as straightforward as RtAudio. So I think that the API abstraction that RtAudio brings is very good.

Jul 13 '20 14:07 llloret
