How to prevent offload gapless playback based on metadata?
I'm working on a ReplayGain implementation for my music player. To keep battery usage low, I want it to work with offload as well. ReplayGain handles files with low volume by increasing the volume above 100%. Clipping is taken into account: the boost is only applied if a file has a low peak volume, leaving silent headroom above it. Increasing the volume above 100% without compression (which isn't needed, since the files still won't clip after the boost) can only be achieved on Android with DynamicsProcessing, not with AudioTrack.setVolume(), which is capped at 100%. So I need to update the DynamicsProcessing parameters in sync with song changes.

However, with gapless offload it's hard to time the change correctly, because the sink already processes the start of the next file before the current one has finished playing. Hence I need to prevent offload gapless playback when the gain changes, while still allowing it when the gain stays the same (so gapless albums still play gapless). The gain is stored in Format.metadata tags. I am implementing this in a decorating AudioSink so that I can update the effect settings before frames are written to the AudioTrack. Is there any way to prevent offload gapless playback this way when the gain differs?
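To spell out why `AudioTrack.setVolume()` can't do this, here is a minimal sketch of the gain math involved (class and method names are my own, not from any library): ReplayGain values are in dB, while `setVolume()` takes a linear scalar clamped to [0, 1], so any positive gain needs an effect such as DynamicsProcessing.

```java
public class ReplayGainMath {
    // Convert a ReplayGain dB value to a linear amplitude scalar.
    static double linearGain(double gainDb) {
        return Math.pow(10.0, gainDb / 20.0);
    }

    // A positive dB gain yields a scalar > 1.0, which AudioTrack.setVolume()
    // cannot express (it is clamped to [0, 1]).
    static boolean needsEffectGain(double gainDb) {
        return linearGain(gainDb) > 1.0;
    }

    // Clipping check: the boosted peak must stay at or below full scale,
    // otherwise a limiter/compressor would be required.
    static boolean wouldClip(double gainDb, double peak) {
        return linearGain(gainDb) * peak > 1.0;
    }
}
```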
Just to check on this point:
it's hard to time the change correctly as it already processes start of next file before current one is done playing
I thought the audiofx implementations (including DynamicsProcessing) were implemented in the HAL so any changes take effect almost immediately even when using offload. So you just need to time the reconfiguration of the audio effect to the moment at which the new track starts playing? (Synchronization/timing still won't be perfect, but if it's a seamless, gapless transition then metadata shouldn't change drastically between the pieces of the content.)
To put it another way, I think you need to time the changes based on the player position, not based on the audio writing position (which can be way ahead in media time when using offload).
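A sketch of what "time the change on the player position" could look like (all names here are hypothetical; in a real player the position would come from `Player.getCurrentPosition()` and applying a change would reconfigure DynamicsProcessing):

```java
import java.util.ArrayDeque;

// Queues gain changes keyed by the media time at which the next track
// starts, and applies them only once the *playback* position reaches that
// boundary (under offload the write position runs far ahead of playback).
public class PendingGainScheduler {
    private static class Change {
        final long boundaryUs; // media time where the new track begins
        final double gainDb;
        Change(long boundaryUs, double gainDb) {
            this.boundaryUs = boundaryUs;
            this.gainDb = gainDb;
        }
    }

    private final ArrayDeque<Change> pending = new ArrayDeque<>();
    private double currentGainDb = 0.0;

    // Called when the sink starts writing a new track's frames.
    public void onNewTrackQueued(long boundaryUs, double gainDb) {
        pending.addLast(new Change(boundaryUs, gainDb));
    }

    // Polled with the playback position; returns true when the gain changed,
    // i.e. when the effect should be reconfigured now.
    public boolean onPositionUpdate(long playbackPositionUs) {
        boolean changed = false;
        while (!pending.isEmpty()
                && pending.peekFirst().boundaryUs <= playbackPositionUs) {
            currentGainDb = pending.removeFirst().gainDb;
            changed = true;
        }
        return changed;
    }

    public double currentGainDb() {
        return currentGainDb;
    }
}
```

As discussed below, this still leaves a synchronization window around the boundary, which is the core of the issue.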
so any changes take effect almost immediately even when using offload
I am not sure the HAL guarantees that; I would assume the HAL uses large-ish buffers internally to save power, so changes are applied with some latency. My assumption is that if I have not given it any frames, then it cannot have already processed them, which would allow perfect synchronization as long as there's a gap. But I haven't tested this properly so far (and even if I did, I'd need to test all major SoC vendors separately for the results to be meaningful, I suppose).
(Synchronization/timing still won't be perfect, but if it's a seamless, gapless transition then metadata shouldn't change drastically between the pieces of the content.)
A gapless transition can also occur e.g. between two MP3 files "by chance", I believe, just because they both carry gapless metadata. But nothing says the files "belong" together just because the metadata is there. So if the user first plays a quiet track (volume will be increased) and then a loud track (volume will be decreased), and the volume decrease applies too late, the already-loud song will be even louder due to the leftover gain, which would blast the user's ears ;).
Yes, there will be a large buffer but I suspect (and hope) there's a requirement for effects to be applied close to the playback head position even during offload. Will check this and get back to you.
And yes, you are right: if there's a mixture of tracks from different albums (for example) there can still be gapless metadata. If these are not actually seamless it's less of an issue.
Since I face the same thing: the issue is that a track can have −12 dB as its ReplayGain value because it's insanely loud. If it's loud from the first frame and the previous track only had −1 dB, it plays 11 dB too loud for a couple of frames, which is dangerous.
The DynamicsProcessing buffer is based on setPreferredFrameDuration, and even with approximations based on that value you can't guarantee there are no too-loud frames: even if you switch too early, the opposite order (a track with a loud ending followed by a quiet next track) carries the same risk.
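To put a number on how dangerous that window is, here is the arithmetic for the scenario above (values taken from the comment; the dB-to-linear conversion is standard):

```java
public class GainDelta {
    // Standard dB-to-linear-amplitude conversion.
    static double dbToLinear(double db) {
        return Math.pow(10.0, db / 20.0);
    }

    public static void main(String[] args) {
        double previousGainDb = -1.0;  // previous, quieter track
        double nextGainDb = -12.0;     // next, very loud track
        // If the old gain is still applied when the loud track starts,
        // the excess is the difference between the two gains.
        double excessDb = previousGainDb - nextGainDb; // 11 dB
        System.out.printf("excess: %.0f dB = %.2fx amplitude%n",
                excessDb, dbToLinear(excessDb));
    }
}
```

An 11 dB excess is roughly a 3.5× amplitude factor for those frames.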
Confirmed with the audio framework team: "Classic (and PCM) offload, the effects are also offloaded (i.e. they run in the [always on compute core]), so any user interaction is applied very close to the playout position with low latency."
For this case it sounds like it's important that the change to the effect doesn't apply even slightly before the end of the first track or after the beginning of the second track, so it may be best to insert silence to give leeway for the effect change to be slightly earlier/later, or to pause at the end of the first track (without writing further data), reconfigure and resume writing.
Thanks for checking with the audio team, that's helpful to know.
For this case it sounds like it's important that the change to the effect doesn't apply even slightly before the end of the first track or after the beginning of the second track
Yeah, that's the problem.
or to pause at the end of the first track (without writing further data), reconfigure and resume writing
...which is practically equivalent to what I was asking for in the OP :)
Inserting silence is actually an interesting idea, but I'm not sure how to synthesise silent frames for all the audio formats one can play (and I saw AC-4 offloading before where I know I'm not going to figure it out either as it's a very closed format), so I think pausing playback, reconfiguring and then writing again is the way to go.
Hey @nift4. We need more information to resolve this issue but there hasn't been an update in 14 weekdays. I'm marking the issue as stale and if there are no new updates in the next 7 days I will close it automatically.
If you have more information that will help us get to the bottom of this, just add a comment!
@andrewlewis hi, it seems the bot decided to not remove the label previously. do you have any advice for how I would do something like this? thanks!
The only solution I can think of for now is to allow the track to finish playout, reconfigure the audio effects and then resume playback at the next track, as you suggested earlier. Are there gaps in the APIs you need to implement that?
If they are gapless tracks in an album this won't work, and I don't think we have a good way to do it with in-framework audio effects. But it's also not clear to me if it makes sense for effects to change at gapless track boundaries. Do you have a concrete case/example media for that?
The only solution I can think of for now is to allow the track to finish playout, reconfigure the audio effects and then resume playback at the next track, as you suggested earlier. Are there gaps in the APIs you need to implement that?
My understanding is that this is completely handled by audio sink:
```java
if (!drainToEndOfStream()) {
  // There's still pending data in audio processors to write to the track.
  return false;
} else if (!pendingConfiguration.canReuseAudioTrack(configuration)) {
  playPendingData();
  if (hasPendingData()) {
    // We're waiting for playout on the current audio track to finish.
    return false;
  }
  flush();
} else {
  // The current audio track can be reused for the new configuration.
  // ...
}
```
My current code is an audio processing chain, which works mostly well for all the other things I have to do (except that I have to use ForwardingAudioSink's help to get Format.metadata to the audio processor, but there's already a feature request for this, see #418). However, I am not sure how I would make canReuseAudioTrack() return false so that I get a new audio track. I cannot just call flush() in configure() from my ForwardingAudioSink, as that would cut off the end of the current song.
If they are gapless tracks in an album this won't work, and I don't think we have a good way to do it with in-framework audio effects. But it's also not clear to me if it makes sense for effects to change at gapless track boundaries. Do you have a concrete case/example media for that?
No, they don't have to change if it's actually one album. The gain stays the same in that case. So that limitation is OK for me.
As an aside since you mention the audio processing chain: that processing can be perfectly/deterministically synchronized with playout unlike (AFAIK) the framework audio effects. But I'm assuming you need framework audio effects for this because you want to use compressed offload. There is a plan to add support for PCM offload where audio processors would work, but it's not going to be implemented soon so won't help in the short term.
For letting things play out I was thinking you'd actually let the player transition to the ended state, then set the effects, then set the next media item at the player level. This is a bit hacky though.
I am not sure how I would trigger canReuseAudioTrack() to be false to get a new audio track.
Looking at the code at tip-of-tree, could you try passing a custom AudioOffloadSupportProvider that uses a different isGaplessSupported value based on the metadata? It looks like that value then becomes part of the configuration and should cause playing pending data when it toggles. (I haven't tried this and it's quite likely I'm missing something so also adding @microkatz to review or suggest a better approach.)
But I'm assuming you need framework audio effects for this because you want to use compressed offload.
Sorry, yes, you're correct. I confused it with non-offload case, for offload I get events from forwarding audio sink.
For letting things play out I was thinking you'd actually let the player transition to the ended state, then set the effects, then set the next media item at the player level. This is a bit hacky though.
I rely on ExoPlayer to do my shuffling and playlist so that would require major surgery, on top of being somewhat hacky.
Looking at the code at tip-of-tree, could you try passing a custom AudioOffloadSupportProvider that uses a different isGaplessSupported value based on the metadata? It looks like that value then becomes part of the configuration and should cause playing pending data when it toggles.
It doesn't seem to have access to both current and next Format. I can try caching it in ForwardingAudioSink for previous value though, I'll try that and report back (I'd appreciate any other ideas if they come to mind, though).
I think that can't work properly, actually. If I start playing song A (gain = 2 dB), I don't yet know what the next song will be (the user might edit the playlist while the song is playing, so it's pointless to try to figure it out now). So I leave useOffloadGapless at its default for the format, say true. Then song B starts playing with a different gain of 0 dB, so I need to prevent gapless and set useOffloadGapless to false. But now, if song C has the same gain, album and format as B, offload gapless could be true again, yet I can't change it back to true for C anymore: it was already decided there won't be gapless for song B, because the configuration is no longer compatible once useOffloadGapless differs.
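The failure mode can be spelled out as a small simulation (all names hypothetical, this only models the decision logic, not the real sink): deciding the gapless flag from the previous track's gain gives B and C different flags, so the configuration toggles at the B→C boundary and that transition loses gapless even though both tracks share a gain.

```java
import java.util.ArrayList;
import java.util.List;

// Simulates deciding a per-track "gapless supported" flag from the
// *previous* track's gain, as cached in a ForwardingAudioSink.
// Tracks: A (2 dB), B (0 dB), C (0 dB).
public class GaplessFlagSimulation {
    static List<Boolean> decideFlags(double[] gains) {
        List<Boolean> flags = new ArrayList<>();
        Double previousGain = null;
        for (double gain : gains) {
            // Allow gapless only when the gain did not change.
            flags.add(previousGain == null || previousGain == gain);
            previousGain = gain;
        }
        return flags;
    }

    // A transition stays gapless only if both tracks carry the same flag
    // (so the configuration is reusable) and that flag is true.
    static boolean transitionIsGapless(boolean from, boolean to) {
        return from == to && from;
    }
}
```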
Makes sense. The other thing that might be worth trying is (again just looking at tip-of-tree) changing audio session ID via a custom AudioOutputProvider with AudioOutputProvider.getFormatConfig overridden to change audio session ID for non-gapless transitions to force recreation. This means before/after can use the same gapless offload config. If that works, it may be tidier anyway because I think you can create the audio session ID and attach effects to it. (Also reassigning @microkatz for follow-ups as I won't be able to reply for a bit.)
overridden to change audio session ID for non-gapless transitions to force recreation.
I'm a bit concerned about this, because external equalizer apps usually don't expect the audio session to change often. One of the most popular ones among my user base, for example, doesn't support multiple audio sessions at all, so I need to tear down the old session before I can start using the new one. If I constantly tear down the audio session during playback and broadcast a new one, it would likely cause some weirdness with these external EQ apps.
While some equalizer apps also use DynamicsProcessing and would conflict with mine, many just use the Equalizer effect, which would work fine even when ReplayGain uses DynamicsProcessing during offload.
Do you think there's another approach that could be used here? Or maybe a chance for a new hook point somewhere?