Excessive targetBufferSize in DefaultLoadControl Can Lead to OOM

Open cucumbersw opened this issue 2 months ago • 2 comments

Version

Media3 1.8.0

More version details

No response

Devices that reproduce the issue

Pixel6Pro Android 15

Devices that do not reproduce the issue

No response

Reproducible in the demo app?

Yes

Reproduction steps

Affected Component: ExoPlayer / DefaultLoadControl

📝 Description of the Problem

I have observed a potential scenario where DefaultLoadControl.calculateTargetBufferBytes() returns an excessively large target buffer size, which can lead to OutOfMemory (OOM) errors, especially when playing high-bitrate content on resource-constrained devices.

The issue is related to how the load control calculates the required buffer for multiple tracks, combined with how certain non-standard tracks are classified during track selection.

Detailed Scenario

  • High-Bitrate Content: A large MP4 file with a very high-resolution and high-bitrate video track is being played.

  • Uncommon Track: The file includes a private or non-standard Metadata track (e.g., custom format).

  • Track Misclassification: During the TrackSelection process, because the metadata track is of an unknown or non-standard type, it defaults to being classified based on the container's MIME type. In this specific case, it may be incorrectly treated as a second Video track.

  • Excessive Buffer Calculation: The DefaultLoadControl then calculates the required buffer size based on two "video" tracks and one audio track. Given the already high bitrate of the actual video, and the default long buffer duration (e.g., 50 seconds), the resulting targetBufferSize becomes prohibitively large (e.g., exceeding 256MB).

Consequence: This aggressive and potentially incorrect memory reservation rapidly consumes the heap, causing an OOM error.
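
To make the arithmetic in the scenario above concrete, here is a rough sketch of a per-track buffer sum. The constants are assumptions recalled from DefaultLoadControl's defaults (64 KiB segments; 2000 segments per video track, 200 per audio track, 2 per metadata track) and should be checked against the Media3 version in use. The class and method names are made up for illustration and are not Media3 API.

```java
import java.util.Arrays;
import java.util.List;

/** Back-of-the-envelope model of a per-track buffer sum; not the Media3 implementation. */
final class TargetBufferEstimate {
    // Assumed values, recalled from DefaultLoadControl's defaults; verify for your version.
    static final int SEGMENT_BYTES = 64 * 1024;
    static final int VIDEO_BYTES = 2000 * SEGMENT_BYTES;   // 131,072,000
    static final int AUDIO_BYTES = 200 * SEGMENT_BYTES;    // 13,107,200
    static final int METADATA_BYTES = 2 * SEGMENT_BYTES;   // 131,072

    enum TrackType { VIDEO, AUDIO, METADATA }

    /** Returns the assumed buffer budget for one track of the given type. */
    static int bytesFor(TrackType type) {
        switch (type) {
            case VIDEO: return VIDEO_BYTES;
            case AUDIO: return AUDIO_BYTES;
            default: return METADATA_BYTES;
        }
    }

    /** Sums the assumed per-type budget over all selected tracks. */
    static long targetBufferBytes(List<TrackType> selectedTracks) {
        long total = 0;
        for (TrackType type : selectedTracks) {
            total += bytesFor(type);
        }
        return total;
    }

    public static void main(String[] args) {
        long correct = targetBufferBytes(
            Arrays.asList(TrackType.VIDEO, TrackType.AUDIO, TrackType.METADATA));
        long misclassified = targetBufferBytes(
            Arrays.asList(TrackType.VIDEO, TrackType.AUDIO, TrackType.VIDEO));
        System.out.println("metadata counted as metadata: " + correct + " bytes");
        System.out.println("metadata counted as video:    " + misclassified + " bytes");
    }
}
```

Under these assumed defaults, the misclassified selection comes to 275,251,200 bytes (about 262.5 MiB), compared with 144,310,272 bytes when the metadata track is counted as metadata. That is consistent with the "exceeding 256MB" figure above.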

💡 Proposed Solution

The root cause (track misclassification) may be hard to fix generically for all custom metadata tracks, but the OOM outcome can be easily mitigated by introducing a safeguard in the DefaultLoadControl.

I propose adding a maximum hard cap to the final calculated buffer size within the calculateTargetBufferBytes() method.

For example, if a constant MAX_BUFFER_SIZE (e.g., 128MB or a suitably determined value) is defined, the method should ensure the return value never exceeds this limit:

// Inside DefaultLoadControl.calculateTargetBufferBytes()
// ... existing calculation logic ...

int targetBufferSize = /* calculated value */;

// Clamp the result to prevent excessive buffer allocation.
return Math.min(targetBufferSize, MAX_BUFFER_SIZE);

This simple fix would prevent runaway memory allocation due to edge-case track misclassification, offering better stability without sacrificing the buffering performance of well-formed streams.
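
As a standalone illustration of the proposed safeguard, the sketch below sums per-track budgets and clamps the result. Everything here is hypothetical: the class, the method name, the 128 MiB cap (taken from the example value above), and the floor (assumed to be 200 segments of 64 KiB) are placeholders, not existing Media3 code.

```java
/** Sketch of a clamped buffer target; names and limits are illustrative, not Media3 API. */
final class CappedBufferTarget {
    // Hypothetical limits: an assumed floor and the 128MB example cap from the proposal.
    static final int MIN_BUFFER_SIZE = 200 * 64 * 1024;      // 13,107,200
    static final int MAX_BUFFER_SIZE = 128 * 1024 * 1024;    // 134,217,728

    /** Sums per-track budgets in a long to avoid int overflow, then clamps to [MIN, MAX]. */
    static int cappedTargetBufferBytes(int[] perTrackBytes) {
        long total = 0;
        for (int bytes : perTrackBytes) {
            total += bytes;
        }
        long clamped = Math.max(MIN_BUFFER_SIZE, Math.min(total, MAX_BUFFER_SIZE));
        return (int) clamped; // Safe: clamped is never larger than MAX_BUFFER_SIZE.
    }

    public static void main(String[] args) {
        // Two "video" budgets plus one audio budget, as in the misclassification scenario.
        int[] misclassified = {131_072_000, 131_072_000, 13_107_200};
        System.out.println(cappedTargetBufferBytes(misclassified) + " bytes after capping");
    }
}
```

One caveat on the value: if the per-track defaults assumed here are right, a single video track plus an audio track already adds up to about 137.5 MiB, so a 128 MiB cap would also clip ordinary selections. The cap would need to be chosen with that in mind.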

Expected result

No OOM error

Actual result

OOM crash

Media

To reproduce, create an MP4 containing a 4K, 64 Mbps video track, a metadata track with the MIME type "application/xyz_private", and an audio track. When playing the MP4 with ExoPlayer, add a dummy metadata renderer so that the metadata track is also selected.


@SuppressLint("UnsafeOptInUsageError")
fun startPlayer(playerView: PlayerView, uri: Uri) {
    // Create the player once, using the custom renderers factory so the metadata track is selected.
    val player = exoPlayer ?: ExoPlayer.Builder(this, MetaRenderersFactory(this))
        .build()
        .apply {
            addAnalyticsListener(EventLogger())
            playWhenReady = true
            repeatMode = Player.REPEAT_MODE_ONE
        }
        .also { exoPlayer = it }

    playerView.player = player
    val mediaSource = ProgressiveMediaSource.Factory(DefaultDataSource.Factory(this))
        .createMediaSource(MediaItem.fromUri(uri))
    player.setMediaSource(mediaSource)
    player.prepare()
}

@UnstableApi
data class DummyMetaFrame(val mimeType:String, val data: ByteBuffer, val timeUs: Long): Metadata.Entry {
    override fun getWrappedMetadataFormat(): Format? = null //return null indicates no wrapped metadata need to be parsed
    override fun getWrappedMetadataBytes(): ByteArray? = data.toByteArray()
}

@UnstableApi
class DummyMetadataDecoder(private val mimeType: String): MetadataDecoder {
    companion object {
        val factory = object: MetadataDecoderFactory {
            // Must match the sampleMimeType of the custom metadata track in the test file.
            override fun supportsFormat(format: Format) = ("application/audio_meta" == format.sampleMimeType)
            override fun createDecoder(format: Format) = DummyMetadataDecoder(format.sampleMimeType?:"UnknownMime")
        }
    }

    override fun decode(inputBuffer: MetadataInputBuffer): Metadata {
        return Metadata(DummyMetaFrame(mimeType, inputBuffer.data!!, inputBuffer.timeUs))
    }
}

@UnstableApi
class MetaRenderersFactory(context: Context): DefaultRenderersFactory(context) {
    override fun buildMetadataRenderers(
        context: Context,
        output: MetadataOutput,
        outputLooper: Looper,
        extensionRendererMode: Int,
        out: ArrayList<Renderer>
    ) {
        super.buildMetadataRenderers(context, output, outputLooper, extensionRendererMode, out)
        // Insert the dummy metadata renderer at index 0 so it comes first in the list.
        out.add(0,
            MetadataRenderer(
                DummyMetaOutput("DummyMeta"),
                null,
                DummyMetadataDecoder.factory,
                true
            )
        )
    }
}

Bug Report

  • [ ] You will email the zip file produced by adb bugreport to [email protected] after filing this issue.

cucumbersw, Oct 22 '25 03:10

I think you are describing two separate issues/improvements:

  1. A track with a custom sample MIME type in a video MP4 file is wrongly classified as C.TRACK_TYPE_VIDEO.
  2. DefaultLoadControl has no overall limit on the total buffer size it can add up based on the selected tracks.

For point (2), I agree it would be useful to have some kind of upper limit as a guardrail. Given there are use cases with multiple video tracks or video + image tracks, we'd have to choose a fairly conservative value to avoid breaking any intended buffer usage. I'd probably pick around 200MB, which is about 1.5 times the default limit for muxed media containing video, audio and text.

For point (1), we should probably tighten this check to only derive the type of a TrackGroup from Format.containerMimeType if the sampleMimeType is actually empty. In the scenario you describe we set a sampleMimeType, but then completely ignore it and assume it's video.
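
To illustrate the difference between the two fallback rules, here is a toy model. The prefix-based classifier and all names are simplifications invented for this sketch; the real type derivation in Media3 is more involved and is not reproduced here.

```java
/** Toy model of deriving a track type from MIME types; not the Media3 implementation. */
final class TrackTypeFallback {
    enum TrackType { VIDEO, AUDIO, UNKNOWN }

    /** Simplified classifier that only looks at the top-level MIME type. */
    static TrackType fromMimeType(String mimeType) {
        if (mimeType == null) return TrackType.UNKNOWN;
        if (mimeType.startsWith("video/")) return TrackType.VIDEO;
        if (mimeType.startsWith("audio/")) return TrackType.AUDIO;
        return TrackType.UNKNOWN;
    }

    /** Loose rule: any unrecognized sample type falls back to the container's type. */
    static TrackType looseType(String sampleMimeType, String containerMimeType) {
        TrackType type = fromMimeType(sampleMimeType);
        return type != TrackType.UNKNOWN ? type : fromMimeType(containerMimeType);
    }

    /** Tightened rule: fall back to the container only when no sample MIME type is set. */
    static TrackType strictType(String sampleMimeType, String containerMimeType) {
        boolean sampleTypeMissing = sampleMimeType == null || sampleMimeType.isEmpty();
        return sampleTypeMissing
            ? fromMimeType(containerMimeType)
            : fromMimeType(sampleMimeType);
    }

    public static void main(String[] args) {
        String sample = "application/xyz_private";
        String container = "video/mp4";
        System.out.println("loose:  " + looseType(sample, container));
        System.out.println("strict: " + strictType(sample, container));
    }
}
```

With the loose rule the private metadata track inherits VIDEO from the container. With the tightened rule it stays UNKNOWN, and a track with no sample MIME type at all still falls back to the container.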

In addition, if you intend to use this custom metadata track, it's worth registering its MIME type using MimeTypes.registerCustomMimeType so that ExoPlayer correctly classifies the track's type. This should also solve your issue because you could teach the player to treat this type as C.TRACK_TYPE_METADATA, for example.
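
The effect of registering a custom type can be pictured with a small stand-in registry. This is only a model of the idea: the class and its methods are hypothetical, and the real entry point is MimeTypes.registerCustomMimeType, whose exact parameters should be taken from the Media3 documentation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a custom MIME type registry; not the Media3 MimeTypes class. */
final class CustomMimeRegistry {
    enum TrackType { VIDEO, AUDIO, METADATA, UNKNOWN }

    private final Map<String, TrackType> customTypes = new HashMap<>();

    /** Associates a custom MIME type with an explicit track type. */
    void register(String mimeType, TrackType trackType) {
        customTypes.put(mimeType, trackType);
    }

    /** Registered types win; otherwise a simplified prefix check is used. */
    TrackType trackTypeOf(String mimeType) {
        if (mimeType == null) return TrackType.UNKNOWN;
        TrackType custom = customTypes.get(mimeType);
        if (custom != null) return custom;
        if (mimeType.startsWith("video/")) return TrackType.VIDEO;
        if (mimeType.startsWith("audio/")) return TrackType.AUDIO;
        return TrackType.UNKNOWN;
    }

    public static void main(String[] args) {
        CustomMimeRegistry registry = new CustomMimeRegistry();
        System.out.println("before: " + registry.trackTypeOf("application/xyz_private"));
        registry.register("application/xyz_private", TrackType.METADATA);
        System.out.println("after:  " + registry.trackTypeOf("application/xyz_private"));
    }
}
```

Once the type is known, the track no longer needs any container-based guess, so it would be budgeted as metadata rather than as a second video track.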

tonihei, Oct 23 '25 10:10

Great! Thanks for mentioning the MimeTypes.registerCustomMimeType method.

cucumbersw, Oct 24 '25 08:10