Underruns when new ad is loaded by Admob: Affinity fixes it

Open peterdk opened this issue 7 months ago • 9 comments

I have a heavily used app (Mini Piano Lite) that has been using Oboe for a long time. Recently I improved my analytics setup and now also report underruns per session. This is quite insightful, because around 14% of my users experience 1+ underruns per session and 5% experience 3+. I did a lot of optimization, like removing mutexes from the audio path and lots of profiling with Perfetto, and in general my audio pipeline has been well optimized over the years. However, the worrisome 5% of sessions with 3+ underruns remains.

I also collect device models, and bought a few (older) devices that had the most underrun sessions and, relatively, the most underruns per session (Galaxy Tab S6 Lite, Galaxy A50, Galaxy A51, but also Pixel 8(!)). So it was time to profile some more with Perfetto, since on these devices the underruns happened very frequently and were consistently reproducible. After lots of experimenting it turned out that the Admob banner, which is always visible and refreshes every minute, is the direct cause of the stuttering/underruns. Even when no audio plays (so no load, only a zero-filled buffer from memset), I get 10 underruns on each ad change. Quite frustrating.

So I was planning to file a bug to get some help from you guys, because all the ad code is optimized: it uses the main thread only when required, and all of Google's recommendations and background-thread loading settings are applied. But I came across an Oboe GitHub Discussions thread about underruns where an example data callback with thread affinity setup code was shared. I tried it, and wow.... all my issues are gone!

So I think Admob causes some thread rescheduling / core moving when loading ads (on some devices), and setting Thread Affinity fixes that completely.
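For readers unfamiliar with the technique, here is a minimal sketch of restricting the calling thread to a set of cores with `sched_setaffinity()` on Linux/Android. It is not the code from the Discussions thread; the helper name `pinCurrentThreadToCpus` is hypothetical. In an Oboe app one would presumably call it once from inside the audio data callback, so that it applies to the audio thread rather than the main thread.

```cpp
#include <sched.h>

#include <cassert>
#include <vector>

// Hypothetical helper (not part of Oboe): restricts the *calling* thread to
// the given CPU indices. Returns true on success, false if the list contains
// no usable index or the kernel rejects the mask.
bool pinCurrentThreadToCpus(const std::vector<int>& cpus) {
    cpu_set_t set;
    CPU_ZERO(&set);
    int added = 0;
    for (int cpu : cpus) {
        if (cpu >= 0 && cpu < CPU_SETSIZE) {
            CPU_SET(cpu, &set);
            ++added;
        }
    }
    if (added == 0) {
        return false;  // An empty mask would be rejected by the kernel anyway.
    }
    // A pid of 0 means "the calling thread" on Linux and Android.
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Which cores to pass in is device specific, and a poor choice can make things worse, so treat the core list as something to measure rather than hard-code.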

Ok, long story, but the thing is, I never found this recommendation in any Oboe docs (and I have read a lot of them over the years). It would be really great to add a section on performance issues somewhere, and to talk a bit about thread affinity there. Maybe even consider making this an option on the builder and handling thread affinity in Oboe itself, rather than relying on each user's callback code. I can't be the only one using Oboe in an app with Admob.

Anyway, I'm really looking forward to the analytics when I release this, but it seems to be a real game changer!

peterdk avatar May 04 '25 14:05 peterdk

Oh, using performance hint did not fix it, I tried that before stumbling on the affinity example.

peterdk avatar May 04 '25 20:05 peterdk

There are two primary known issues with CPU Performance on Android. These affect apps that have a non-trivial workload, like synthesizers. These apps require that the CPUs run at moderate frequencies and sometimes they have to run on bigger CPUs.

  1. When an app's workload suddenly increases, the CPU must speed up and the task may need to move to a bigger core to handle it. Unfortunately it can take about 100 msec to reach the required frequency, and by then the 4 msec buffer has long since underflowed. Using a steady workload will avoid this problem but, unfortunately, that consumes more battery power.

  2. When a task switches CPUs, it loses the information about the required performance. So the frequency drops to the minimum and we have the problem in #1. Setting CPU affinity will help, but that also interferes with the scheduler, so we have not recommended it.

Neither of these workarounds is healthy for the system. But you gotta do what you gotta do.

We are working with the kernel teams to solve these problems.

> setting Thread Affinity fixes that completely.

@peterdk - How did you decide which CPU to use?

philburk avatar May 11 '25 20:05 philburk

Thanks for the info.

I did a lot of testing this week, and settled on setting affinity to 2 cores. That gave the best results across old and new devices. I experimented with the performance hint callback enabled, but that did not seem to improve results (and may even have had a negative impact).

I did a lot of research and found that you can read /sys/devices/system/cpu/cpuX/cpu_capacity for each available CPU core. This indicates the (relative/normalized) performance of each core (complex). If a kernel/device is very old (and doesn't provide cpu_capacity), I use /sys/devices/system/cpu/cpuX/cpufreq/cpuinfo_max_freq. When you group the cores by these values, you get several core complex groups. Most of the older devices have a high performance core group (core 0-3) and a low performance group (4-7), of course depending on the number of cores and the CPU hardware. More recent devices have 3 groups with 1 top core, 4 medium and 4 low. Anyway, you can just parse these values and then apply some logic to determine what you want to use.
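The parsing and grouping described above could be sketched roughly as follows. This is an illustrative sketch, not the code used in the app: the names `readSysfsValue`, `readCpuPerformance`, `groupCores` and `CoreGroups` are hypothetical, and it assumes each sysfs file contains a single integer.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Reads a single integer from a sysfs file; returns -1 if it cannot be read.
long readSysfsValue(const std::string& path) {
    std::ifstream in(path);
    long value = -1;
    if (!(in >> value)) {
        return -1;
    }
    return value;
}

// Performance indicator for one core: cpu_capacity if present, otherwise the
// maximum frequency as a fallback for older kernels. -1 if neither exists.
long readCpuPerformance(int cpu) {
    const std::string base =
        "/sys/devices/system/cpu/cpu" + std::to_string(cpu);
    long value = readSysfsValue(base + "/cpu_capacity");
    if (value < 0) {
        value = readSysfsValue(base + "/cpufreq/cpuinfo_max_freq");
    }
    return value;
}

// Maps a performance value to the cores that share it, fastest group first.
using CoreGroups = std::map<long, std::vector<int>, std::greater<long>>;

// perf[i] is the performance value of core i; negative entries are skipped.
CoreGroups groupCores(const std::vector<long>& perf) {
    CoreGroups groups;
    for (std::size_t cpu = 0; cpu < perf.size(); ++cpu) {
        if (perf[cpu] >= 0) {
            groups[perf[cpu]].push_back(static_cast<int>(cpu));
        }
    }
    return groups;
}
```

A caller would fill the `perf` vector by calling `readCpuPerformance(i)` for each core index and then pass it to `groupCores`. Note that capacity and frequency values are on different scales, so the grouping is only meaningful if all cores on a device were read from the same kind of file.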

For my app I have now settled on taking 2 cores from the top group (in case of 1 - 2 core groups) or 2 from the medium group (in case of 3+ core groups). It seems best to avoid the low performance core groups, and it also seems optimal not to set affinity to cores from different groups. I noticed a lot more underruns when the scheduler was allowed to move the thread from a top core to a medium core. Also, when you set affinity to 1 core, the main thread (or some Admob worker) might (somehow) sometimes be scheduled on that same core, and then you get massive underruns. Hence the 2 cores.
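As a rough illustration of that selection rule (not the app's actual code), the sketch below picks up to 2 cores from a single group. The `CoreGroups` alias and `pickAffinityCores` are hypothetical names, and treating the second-fastest group as "the medium group" when there are 3 or more groups is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Performance value -> cores sharing that value, fastest group first.
using CoreGroups = std::map<long, std::vector<int>, std::greater<long>>;

// Picks up to `count` cores, all from one group:
//  - with 1 or 2 groups, from the fastest group;
//  - with 3 or more groups, from the second-fastest group (assumed here to
//    be the "medium" group).
// May return fewer than `count` cores if the chosen group is small.
std::vector<int> pickAffinityCores(const CoreGroups& groups,
                                   std::size_t count = 2) {
    if (groups.empty()) {
        return {};
    }
    auto it = groups.begin();  // Fastest group.
    if (groups.size() >= 3) {
        ++it;                  // Skip the top group, use the next one.
    }
    const std::vector<int>& cores = it->second;
    const std::size_t n = std::min(count, cores.size());
    return std::vector<int>(
        cores.begin(), cores.begin() + static_cast<std::ptrdiff_t>(n));
}
```

For example, with groups `{1024: [7], 768: [4, 5, 6], 512: [0, 1, 2, 3]}` this returns cores 4 and 5. The result is a candidate affinity mask only; as the rest of the thread shows, it still needs to be validated per device.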

I understand that this is totally tuned to what my app is doing, but this is my experience. And even though there might still be the rare underrun, at least there is no longer any impact when an ad loads. Which is a big win for my banner-ad-supported app.

I am planning to slowly roll this out with the new version of my app and will monitor my analytics to check for adverse effects. Will update here when I have some numbers.

peterdk avatar May 11 '25 21:05 peterdk

During testing I still had issues, unfortunately, so I postponed the rollout of the affinity changes. But it's certainly a topic I will explore more, since knowing about the core groups gives insight into the scheduler's behaviour.

peterdk avatar May 25 '25 11:05 peterdk

What type of issues did you run into? Was it worse than not enabling it?

robertwu1 avatar May 27 '25 18:05 robertwu1

On a Pixel 8 device I sometimes got bursts of underruns (around 50 in one go), which was notably worse than without setting affinity. I'm looking into it by profiling with Perfetto and simpleperf. I tested several different affinity strategies. Not yet conclusive.

peterdk avatar May 27 '25 18:05 peterdk

Interesting! Have you tried enabling ADPF with and without affinity? Can you send me a bugreport from the Pixel 8 via email (robertwu at google dot com) so I can report this internally to the Pixel team?

robertwu1 avatar May 27 '25 18:05 robertwu1

Yes, I will test more in the coming weeks. It's not limited to the Pixel 8; I also had issues with the older Samsung devices I tested. So yeah, I plan to do a more thorough run to see if I can pinpoint an issue (be it my code, scheduling behaviour in combination with Admob, or something else).

peterdk avatar May 27 '25 18:05 peterdk

Thanks! Let me know how your investigation goes!

robertwu1 avatar May 27 '25 18:05 robertwu1