sonobus icon indicating copy to clipboard operation
sonobus copied to clipboard

JUCE 6.0.8 issues in SonoBus 1.4.7 and later

Open kdoren opened this issue 3 years ago • 28 comments

@essej I am testing on Raspberry Pi but I suspect this issue may affect other linux platforms.

I am seeing a bug or at least a degradation in performance between SonoBus 1.4.6 and 1.4.7 and later. For same jack audio settings, 1.4.7 drops packets where 1.4.6 does not. The problem is not subtle, even using long periods in jack audio does not help.

This is due to the change from JUCE 6.0.8 in SonoBus 1.4.6 to JUCE 6.1.2 in SonoBus 1.4.7 and later.

I built SonoBus 1.4.9 against JUCE 6.0.8 from SonoBus 1.4.6, (by copying deps/juce) and good audio performance was restored. Everything else was the same.

Current JUCE is 6.1.5. There were some issues related to process priorities that were fixed in 6.1.3, but they seem related to openGL display performance, so I don't expect it to be the answer.

Still, I would suggest updating SonoBus to current JUCE to see if it fixes problems. I can't build against it due to the custom changes to JUCE used by SonoBus.

kdoren avatar Feb 01 '22 19:02 kdoren

UGH! Do you have some tips on reproducing the performance issues? Have you only seen it (or tested it) on RPi hardware, if so which ones? I could merge from the latest juce tip, but I somehow think it would be a crapshoot if it actually fixed it. Considering how slow compiles are on RPi, I don't look forward to doing a binary search to figure out when/what got introduced that causes it. Jump on Slack for some interactive chat about it, if you have a chance.

essej avatar Feb 01 '22 20:02 essej

@essej If you make a branch with updated JUCE, I can build and test against it.

kdoren avatar Feb 01 '22 21:02 kdoren

@essej the problem commit is this one:

Thread::setPriority() will no longer set a realtime scheduling policy for all threads with non-zero priorities on POSIX systems.

https://github.com/essej/JUCE/commit/802f33b0e865011aca07058fe5644cae27a46b8f#diff-eb35e9b62a33e679b1feab7ee3377e324e12d5ace9a943bc86d259aa65fbb4ce

If I revert that change it works as before. It also looks like I can change thread after launch with chrt with similar effect (though not so easy to identify which threads to change). I think there are threads which should be prioritized but no longer are.

kdoren avatar Feb 02 '22 19:02 kdoren

@essej I can work around this problem for now by patching to revert the threading commit that is causing trouble:

https://github.com/kdoren/jambox-debs/blob/main/sonobus/debian/patches/0001-revert_juce_priority_change.patch

kdoren avatar Feb 04 '22 15:02 kdoren

I'm reading from that JUCE BREAKING-CHANGES.txt rationale:

By default, new Thread instances have a priority of 5. Previously, non-zero priorities corresponded to realtime scheduling policies, meaning that new Threads would use the realtime scheduling policy unless they explicitly requested a priority of 0. However, most threads do not and should not require realtime scheduling. Setting a realtime policy on all newly-created threads may degrade performance, as multiple realtime threads will end up fighting for limited resources.

But it seems to me if the sonobus thread is specially an audio processing thread then maybe we should just do what they suggest:

For threads that require a realtime policy on POSIX systems, request a priority of 8 or higher by calling Thread::setPriority() or Thread::setCurrentThreadPriority().

I'm trying to search the sonobus repository for wherever the priority is getting set, and so far I've noticed it here:

https://github.com/sonosaurus/sonobus/blob/872f911e85ff9c113ddbcdad1077edc6bbf6420f/Source/SonobusPluginProcessor.cpp#L922-L924

And it seems (9 >= 8) so that seems like it should be getting realtime. Maybe is there another thread sonobus needs to explicitly set to have a priority >= 8?

I'd have to look more but maybe is it jack audio processing thread that needs to be explitly set to have >= 8 priority? I'm thinking like maybe somewhere in https://github.com/juce-framework/JUCE/blob/master/modules/juce_audio_devices/native/juce_linux_JackAudio.cpp ...though maybe I'd have to look more...I would think setting realtime priority should be already being done somewhere though, cause JUCE should really want it realtime cause there is a deadline to finish processing the audio in time. (At least that is the whole purpose why I install a realtime kernel!)

It seems to me this JUCE change would affect regular x86/x86-64 linuxes in addition to raspberry pi, so I'm eager to fix it.

(I have a raspberry pi 4 running your realtime kernel 5.15.18-rt28-v8+, and I have a brother I jam with who I manually compile sonobus for their raspberry which they run arch linux on.)

ericfont avatar Feb 08 '22 18:02 ericfont

Eric, the important threads I am setting that high priority for manually, and the high priority audio thread is provided by the system audio driver, all of those still have their normal real-time priorities with both the old and the new JUCE code. The difference seems to be in all the other threads that get created by default, in the old version they had a slightly elevated priority, but in the new one they do not.

The thing I don’t understand is why they are affecting anything important in the application.

essej avatar Feb 08 '22 18:02 essej

ok, I will keep an eye out. I have just built the latest git sonobus and running on x86-64 5.10.83.58 realtime kernel with jack buffer size 6 @44.1kHz...I just started sonobus and don't see anything funny, but I'll have to find someone to jam with. I'll have to try on raspberry later.

@kdoren could you describe more specifically...when you say "drops packets" do you mean UDP packets or local audio frames?

ericfont avatar Feb 08 '22 20:02 ericfont

ok, I will keep an eye out. I have just built the latest git sonobus and running on x86-64 5.10.83.58 realtime kernel with jack buffer size 6 @44.1kHz...I just started sonobus and don't see anything funny, but I'll have to find someone to jam with. I'll have to try on raspberry later.

@kdoren could you describe more specifically...when you say "drops packets" do you mean UDP packets or local audio frames?

Pretty sure he just meant audio glitches of unknown origin. After all, the "drops" stat really could be for any reason, it actually doesn't only mean dropped packets... it really just means data that didn't make it in time.

essej avatar Feb 08 '22 20:02 essej

I just jammed for 30 minutes on x86-64-rt...didn't notice any unusual glitches.

ericfont avatar Feb 08 '22 21:02 ericfont

so I've just built on Raspberry and am running jack in dummy mode 16-frames@48kHz (aarch64 5.15.18-rt28-v8+ realtime), and ran the raspberry output into input and on my x86-64-rt computer...and everything works fine...jitter buffer set at 1ms and hear myself fine. I can try testing somemore later but any specifics on how to reproduce the issue on RPi realtime (and your exact kernel number) would be nice.

ericfont avatar Feb 08 '22 22:02 ericfont

@kdoren does your issue happen when you are using the jack dummy interface, or does it only happen when you use a real interface?

ericfont avatar Feb 08 '22 22:02 ericfont

@ericfont I use a real jack interface, with either USB interface or HAT card (hifiberry).
jackd2 1.9.20, period = 64 (nperiods = 3). But I saw issues even when I tried longer periods. Low-latency kernel. (PREEMPT_RT kernel actually gives slightly worse results on RPi, apparently due to FIQ spin-locks.)

There definitely seem to be threads that needs realtime scheduling but are not getting it. I can make the errors stop by using chrt to change the scheduling (see below)

Running a test between unpatched sonobus vs patched (good): Each side sending 2 channels at 160kbps, with fixed 4 ms jitter buffer. With both sides patched or running 1.4.6, this can run for days with no dropped packets or syslog errors.

FYI error-free operation of 1.4.6 requires launching SonoBus with main thread set to rr priority 40 (sonobus start script on jambox image does this). This does not work on 1.4.9, but manually setting threads does work.

on unpatched 1.4.9, on box with hifiberry HAT card, launch sonobus with no chrt priority setting:

pi@jambox:~ $ chrt -a -p 10576
pid 10576's current scheduling policy: SCHED_OTHER
pid 10576's current scheduling priority: 0
pid 10581's current scheduling policy: SCHED_RR
pid 10581's current scheduling priority: 50
pid 10582's current scheduling policy: SCHED_RR
pid 10582's current scheduling priority: 50
pid 10583's current scheduling policy: SCHED_OTHER
pid 10583's current scheduling priority: 0
pid 10584's current scheduling policy: SCHED_OTHER
pid 10584's current scheduling priority: 0
pid 10589's current scheduling policy: SCHED_OTHER
pid 10589's current scheduling priority: 0
pid 10590's current scheduling policy: SCHED_OTHER
pid 10590's current scheduling priority: 0
pid 10591's current scheduling policy: SCHED_FIFO
pid 10591's current scheduling priority: 85
pid 10592's current scheduling policy: SCHED_OTHER
pid 10592's current scheduling priority: 0
pid 10593's current scheduling policy: SCHED_OTHER
pid 10593's current scheduling priority: 0
pid 10594's current scheduling policy: SCHED_OTHER
pid 10594's current scheduling priority: 0
pid 10595's current scheduling policy: SCHED_OTHER
pid 10595's current scheduling priority: 0

This gives packet drops in SonoBus (both ends) syslog shows periodic bursts of errors:

Feb  9 01:14:48 jambox jackdrc[10557]: JackEngine::XRun: client = SonoBus was not finished, state = Running
Feb  9 01:14:48 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackEngine::XRun: client = SonoBus was not finished, state = Running
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:55 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:58 jambox jackdrc[10557]: JackEngine::XRun: client = SonoBus was not finished, state = Running
Feb  9 01:14:58 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error
Feb  9 01:14:58 jambox jackdrc[10557]: JackAudioDriver::ProcessGraphAsyncMaster: Process error

While SonoBus is running, change process scheduling to look like 1.4.6 or patched 1.4.9: This has same effect as my patch that reverts the JUCE commit. I don't know if some subset of these changes would be sufficient.

pi@jambox:~ $ chrt -r -p 40 10576
pi@jambox:~ $ chrt -r -p 50 10583
pi@jambox:~ $ chrt -r -p 50 10584
pi@jambox:~ $ chrt -r -p 50 10592
pi@jambox:~ $ chrt -r -p 20 10593
pi@jambox:~ $ chrt -r -p 50 10594
pi@jambox:~ $ chrt -r -p 69 10595

Results:

pi@jambox:~ $ chrt -a -p 10576
pid 10576's current scheduling policy: SCHED_RR
pid 10576's current scheduling priority: 40
pid 10581's current scheduling policy: SCHED_RR
pid 10581's current scheduling priority: 50
pid 10582's current scheduling policy: SCHED_RR
pid 10582's current scheduling priority: 50
pid 10583's current scheduling policy: SCHED_RR
pid 10583's current scheduling priority: 50
pid 10584's current scheduling policy: SCHED_RR
pid 10584's current scheduling priority: 50
pid 10589's current scheduling policy: SCHED_OTHER
pid 10589's current scheduling priority: 0
pid 10590's current scheduling policy: SCHED_OTHER
pid 10590's current scheduling priority: 0
pid 10591's current scheduling policy: SCHED_FIFO
pid 10591's current scheduling priority: 85
pid 10592's current scheduling policy: SCHED_RR
pid 10592's current scheduling priority: 50
pid 10593's current scheduling policy: SCHED_RR
pid 10593's current scheduling priority: 20
pid 10594's current scheduling policy: SCHED_RR
pid 10594's current scheduling priority: 50
pid 10595's current scheduling policy: SCHED_RR
pid 10595's current scheduling priority: 69

Now the errors stop. SonoBus stops dropping packets, and syslog errors stop.

I build another application which uses JUCE 6.1.2 (JammerNetz). I have not done extensive testing, but it does not appear to have similar problems. Only one thread has FIFO priority, all others are Other prioriry 0. It uses jack PERIOD=128, but I have seen SonoBus error at that setting.

kdoren avatar Feb 09 '22 11:02 kdoren

hmm...so if it is one or more of these specific threads:

pi@jambox:~ $ chrt -r -p 40 10576
pi@jambox:~ $ chrt -r -p 50 10583
pi@jambox:~ $ chrt -r -p 50 10584
pi@jambox:~ $ chrt -r -p 50 10592
pi@jambox:~ $ chrt -r -p 20 10593
pi@jambox:~ $ chrt -r -p 50 10594
pi@jambox:~ $ chrt -r -p 69 10595

Then it seems the way I would debug it is to run sonobus in debug mode in an IDE and see what those threads are. And try changing their priority one by one.

I also have a hifiberry (DAC+ ADC Pro) and could test this to verify if I can get the same issue. I feel this issue would happen with dummy jack as well.

ericfont avatar Feb 09 '22 12:02 ericfont

I'm not noticing any issue right now...I'm running jack dummy on rasapberry pi 4 aarch64 and real interface on x86-64 computer, both realtime kernel with realtime jack priority with 64-frame buffers @44.1 KHz, connected over lan, with 2-ch 160 kbps/ch and 4 ms jitter buffer. No drops at all.

ericfont avatar Feb 09 '22 12:02 ericfont

I just now went to follow https://github.com/kdoren/jambox-debs#readme though I'm noticing that doesn't work aarch64...maybe I should just flash a fresh image of your jambox now if that is what your rpi4 is currently running.

ericfont avatar Feb 09 '22 12:02 ericfont

i'm installing https://github.com/kdoren/jambox-pi-gen/releases/download/v1.5.0/image_2021-11-20-Jambox_pi-gen_v1.5.0.zip now...

ericfont avatar Feb 09 '22 12:02 ericfont

well I installed that jambox-pi on my rpi4 with hifiberry. But I couldn't compile sonobus latest git. See attached errors err.txt which has a lot of the following compile errors:

juce_Atomic.h:58:24: error: static assertion failed: This class can only be used for lock-free types
         static_assert (std::atomic<Type>::is_always_lock_free,

ericfont avatar Feb 09 '22 14:02 ericfont

So at this point my feeling is there is something particular about the jambox image setup which creates issue with latest sonobus.

ericfont avatar Feb 09 '22 14:02 ericfont

(same compile errors when I build with a checkout of 1.4.9 tag instead of master.)

ericfont avatar Feb 09 '22 14:02 ericfont

if you could provide me any other details about your system setup that would be nice, but at the moment I can't help any.

ericfont avatar Feb 09 '22 14:02 ericfont

(same compile errors when I build with a checkout of 1.4.9 tag instead of master.)

For some reason kevin needs to use this before running ./setupcmake.sh export CXXFLAGS="-march=native" export CFLAGS="-march=native"

The cmake should have been adding this on ARM architectures but apparently it isn't quite working sometimes.

essej avatar Feb 09 '22 15:02 essej

ok, well I've built 1.4.9 now with those export flags, and I can run it on my pi, with those same settings (64-frame @48kHz, send/receive 2 ch @160 kbps) over my local network between my x86-64 machine and the rpi4. But I don't see any drops reported. So I don't know how to reproduce kevin's issue.

ericfont avatar Feb 09 '22 16:02 ericfont

one drop, that is it so far in the last 5 minutes.

ericfont avatar Feb 09 '22 16:02 ericfont

still only 40 drops over the last 2 hours and 20 minutes.

ericfont avatar Feb 09 '22 19:02 ericfont

do you see syslog errors?

kdoren avatar Feb 09 '22 19:02 kdoren

what type of errors should I look for...I've copied my syslog here: syslog.txt

ericfont avatar Feb 09 '22 19:02 ericfont

quite a lot of jambox jackdrc[429]: JackAudioDriver::ProcessGraphAsyncMaster: Process error

ericfont avatar Feb 09 '22 19:02 ericfont

Yes those kinds of errors during sonobus operation are what I have been seeing. When I set process scheduling as described above, or patch to revert that commit, I can run for days with no drops and no syslog errors.

I haven't tried on a stock raspian image (latest is running bullseye).

I'm not too far from being able to generate a jambox image based on debian/raspian bullseye. I'll try again on that.

kdoren avatar Feb 09 '22 20:02 kdoren