
[SAMPLE REQUIRED] Oboe best practices for low latency missing or unanswered in samples (CPU frequency scaling & thermal throttling)

Open Diljeet opened this issue 6 years ago • 6 comments

Hi everyone (Mr. Phil and others), I watched a lot of Google I/O videos about low latency audio, from before and after Oboe.

I have a synthesizer app on Google Play named Harmonium

Here is everything I found on low latency:

  • Use native C++ code instead of Java. I benchmarked this myself on dual-core to octa-core devices, from ICS to Oreo.
  • Use the device's sample rate, burst size and native sample format to avoid any mixer or resampler on the output path, which is ultimately what gets you the fast path.
  • Also set performance mode to low latency and sharing mode to exclusive for the fast path.
  • Before Oboe there was a trick of sending fake touches to keep the CPU from sleeping (Mr Dan suggested this hack at Google I/O 2016; I was using it before Oboe).
  • Buffer size: the suggested minimum is twice the burst size, but we can use the latency tuner (or the underrun count) to tune it.
  • Core migration: I saw a set-affinity example in the Oboe MegaDrone sample.
  • CPU frequency scaling: I haven't found anything on this in the samples, but at Google I/O 2017 Mr. Don Turner said to stabilize the load to keep the CPU frequency from dropping. He said stabilizing could be done by gating voices on/off or with assembler no-operation instructions (I have no idea what all this instruction stuff is). I did see an implementation in the SimpleSynth sample (https://github.com/googlesamples/android-audio-high-performance/tree/master/SimpleSynth), but it is old code from before Android O & P, so I don't know whether to use it or not. I also don't want to drain the battery by performing useless calculations to keep the frequency up while the foreground app is idle.
  • Thermal throttling: sustained performance mode will probably help with this.
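The underrun-count tuning mentioned in the buffer size bullet can be sketched roughly as follows. This is a minimal, self-contained illustration and not Oboe's LatencyTuner: `SimpleTuner` and its fields are hypothetical names, and the caller is assumed to pass in the stream's cumulative underrun count and apply the returned size to the stream itself.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical, simplified tuner (NOT Oboe's LatencyTuner).
// It grows the buffer by one burst each time new underruns are observed.
struct SimpleTuner {
    int32_t framesPerBurst;   // burst size reported by the device
    int32_t capacityFrames;   // upper bound for the buffer size
    int32_t bufferFrames;     // current buffer size in frames
    int32_t lastXRunCount;    // underrun count seen on the previous update

    SimpleTuner(int32_t burst, int32_t capacity, int32_t initialFrames)
        : framesPerBurst(burst),
          capacityFrames(capacity),
          bufferFrames(initialFrames),
          lastXRunCount(0) {}

    // Call periodically with the cumulative underrun count.
    // Returns the buffer size (in frames) the stream should now use.
    int32_t update(int32_t xRunCount) {
        if (xRunCount > lastXRunCount) {
            // New underruns since the last check: add one burst, capped at capacity.
            bufferFrames = std::min(bufferFrames + framesPerBurst, capacityFrames);
            lastXRunCount = xRunCount;
        }
        return bufferFrames;
    }
};
```

Note that a tuner of this shape only ever grows the buffer, so a single underrun raises the buffer size for the rest of the session.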

So my request is simply an Oboe example where we can see all of these features implemented together:

  • Device sample rate, burst size, sample format, low latency performance mode and exclusive sharing mode. These are already implemented in most of the samples.
  • We request that thermal throttling, CPU frequency scaling, core migration (affinity via getexclusivecores in the MegaDrone sample) and the latency tuner be put in one sample along with the already implemented features above.

And if any of these issues doesn't need handling, then please explain that in the Oboe readme, so users will not worry about dealing with them.

Waiting for your response. I will release my next update as soon as this topic is answered or after looking at a sample. (Making a sample could be easy and I can put all the pieces together myself, but I don't know the aftermath. You understand the framework from the inside, so it will be better if you make and test the sample, and most probably add some new and better optimizations.)

Diljeet avatar Aug 12 '19 12:08 Diljeet

The reason I asked for a sample that handles all these situations is this: suppose we have a double buffer (2 x burst size) with no glitches, and all of a sudden our audio render thread is moved to another core and we get a glitch/underrun. The latency tuner will now increase the buffer size to 3 x burst. If a burst is 4 ms, that is 4 ms more latency, and although the core migration may never happen again, our app's latency still suffers.

The same goes for thermal throttling: if the CPU gets hot and we glitch, the buffer size is increased. However, if the app is played key press by key press, the CPU may not always be hot and we don't need the extra buffer size.

The same goes for CPU frequency scaling. Suppose I am pressing keys on my synthesizer app and I stop for a while because of a silence in the song I am playing. Then I hit a key again, and the CPU cannot ramp up to the required frequency fast enough, which results in a glitch/underrun. The latency tuner increases the buffer size because of the underrun, and now we have an extra 4 ms (one burst) of latency that shouldn't be there.

So I request an example that deals with core migration, frequency scaling and thermal throttling along with latency tuning. Please sir, I hope for a quick response (as my app suffers underruns). This example will also be useful for others developing low latency apps.
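The latency cost described in these scenarios is simple arithmetic, sketched below. The 4 ms burst comes from the post; the concrete figures of 192 frames per burst at 48000 Hz are only an assumption chosen because they give exactly 4 ms, and the function names are illustrative.

```cpp
#include <cstdint>

// Time, in milliseconds, that `frames` of audio last at `sampleRate` Hz.
constexpr double framesToMillis(int32_t frames, int32_t sampleRate) {
    return 1000.0 * frames / sampleRate;
}

// Latency contributed by a buffer holding `bursts` bursts of `framesPerBurst` frames.
constexpr double bufferLatencyMillis(int32_t bursts, int32_t framesPerBurst,
                                     int32_t sampleRate) {
    return framesToMillis(bursts * framesPerBurst, sampleRate);
}
```

With the assumed figures, `bufferLatencyMillis(2, 192, 48000)` is 8 ms and `bufferLatencyMillis(3, 192, 48000)` is 12 ms, so one tuner step after a single underrun costs one extra burst, i.e. 4 ms.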

Good day @philburk @dturner. Please, Mr. Phil, Mr. Don and the whole team, you helped me the last time and I need your help again. Thanks

Edit :-

I found that Oboe has a StabilizedCallback class, used in the MegaDrone sample, and I am trying to use it. However, I found "Trace" statements in it. Is it OK to use the callback with the tracing? The MegaDrone sample has solved core migration and CPU frequency scaling, but it ignores latency tuning (LatencyTuner) and thermal throttling (sustained performance mode). Are they not necessary to implement, and what could be the downside of implementing them?

Diljeet avatar Aug 13 '19 08:08 Diljeet

Thanks for your suggestion. MegaDrone and hello-oboe do most of these things already, but we will consider adding sustained performance mode and consolidating a few things into a single sample, probably MegaDrone.

> Still i found "Trace" statements in it, is it ok to use the callback with the tracing?

Tracing is fine to use inside the callback; it doesn't block.

dturner avatar Aug 14 '19 16:08 dturner

Thanks for the reply, waiting for the sample update.

Diljeet avatar Aug 14 '19 18:08 Diljeet

I have one question and need one piece of advice too.

@dturner Hello Mr. Don, thanks for all the help and guidelines. I successfully implemented StabilizedCallback, SetCpu (getexclusivecores) and latency tuning in my app. I tested on all the devices I have available.

Question: I saw that StabilizedCallback uses 80% of the burst time for processing (constexpr float kPercentageOfCallbackToUse = 0.8;). Is this value safe and tested? Also, when exiting the app, is it not necessary to check the stream state in the StabilizedCallback::generateLoad function? It may cause a delay even after we stop the stream: the extra 2-3 ms that generateLoad spends waiting during exit, i.e. in the stopped state, could result in ANRs. Sorry for my concerns. I suggest changing

generateLoad(stabilizingLoadDurationNanos);
// to
generateLoad(oboeStream, stabilizingLoadDurationNanos);

void StabilizedCallback::generateLoad(int64_t durationNanos) {
    // ...
    while (currentTimeNanos <= deadlineTimeNanos) {
        // ...
    }
}
// to
void StabilizedCallback::generateLoad(AudioStream *oboeStream, int64_t durationNanos) {
    // ...
    // Also stop spinning as soon as the stream is no longer started.
    while (currentTimeNanos <= deadlineTimeNanos && oboeStream->getState() == StreamState::Started) {
        // ...
    }
}
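A standalone sketch of the idea, for readers who want to try it outside Oboe. It is not Oboe's StabilizedCallback: the atomic flag `gStreamStarted` is a hypothetical stand-in for `oboeStream->getState() == StreamState::Started`, and `stabilizingLoadNanos` is an assumed way of deriving the spin time from the 80% figure quoted above (a fraction of the callback period minus the time already spent rendering).

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for "oboeStream->getState() == StreamState::Started".
std::atomic<bool> gStreamStarted{true};

// Monotonic clock reading in nanoseconds.
int64_t nowNanos() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// How long to spin: a fraction of the callback period minus the time the
// real rendering already took. Never negative.
int64_t stabilizingLoadNanos(int32_t numFrames, int32_t sampleRate,
                             int64_t renderNanos, double fractionToUse) {
    const int64_t periodNanos =
        (static_cast<int64_t>(numFrames) * 1000000000LL) / sampleRate;
    const int64_t targetNanos = static_cast<int64_t>(periodNanos * fractionToUse);
    return targetNanos > renderNanos ? targetNanos - renderNanos : 0;
}

// Busy-wait until the deadline, but bail out early once the stream is no
// longer started. Returns the nanoseconds actually spent spinning.
int64_t generateLoad(int64_t durationNanos) {
    const int64_t startNanos = nowNanos();
    const int64_t deadlineNanos = startNanos + durationNanos;
    while (nowNanos() <= deadlineNanos &&
           gStreamStarted.load(std::memory_order_acquire)) {
        // Polling the clock and the flag is the load itself.
    }
    return nowNanos() - startNanos;
}
```

Under these assumptions, a 192-frame callback at 48000 Hz lasts 4 ms, so with 1 ms of rendering the spin time is 2.2 ms; clearing `gStreamStarted` from the thread that stops the stream makes `generateLoad` return on its next loop check instead of waiting out the deadline.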

Advice needed: I can easily implement sustained performance mode in my app; I just want your opinion on whether to do so. At Google I/O 2017 you said that a stabilizing load should work best with sustained performance mode, yet it is never used in the samples. As sustained performance mode came before Oreo and AAudio, I want to know whether it is required or not. Currently the Oboe samples do very little processing in the audio callback, but real apps sometimes do a little more. What could be the benefit or downside of using it? Please tell me your experience on this, not the theory but actual testing results.

Diljeet avatar Aug 17 '19 09:08 Diljeet

Maybe we could release the "OboeSynth" MIDI app for this. It is a reasonable starting point for apps.

philburk avatar Aug 20 '19 16:08 philburk

We need a more robust way of generating load which will never be optimized out by the compiler.

dturner avatar Aug 20 '19 16:08 dturner

All of the techniques that we recommend for low latency and high performance are used in the examples. Techniques involving CPU affinity, or stabilized callback are not recommended. Thermal throttling and frequency scaling are not directly controllable by the app.

We did add a new PerformanceHint to the AudioStream. It is experimental. But it may help on some devices.

https://google.github.io/oboe/classoboe_1_1_audio_stream.html#ad588d956ff4bb9669f37ec143b94818d

philburk avatar Jun 15 '23 21:06 philburk