
Add API to control encode complexity

Open ssilkin opened this issue 1 year ago • 16 comments

Background

Encode complexity settings are hardcoded for WebRTC's built-in encoders (libvpx VP8/VP9, libaom AV1 and OpenH264). The settings depend on platform, number of CPU cores and video resolution and are optimized to provide acceptable performance on a wide range of devices. In some scenarios these default settings are suboptimal. Access to encode complexity settings would allow applications to optimize the trade-off between device resource usage and compression efficiency for their use cases. For example, a higher encode complexity mode can be used to achieve better video quality and/or to reduce video bitrate.

Proposed API

Add encodeComplexityMode to RTCRtpEncodingParameters:

enum RTCEncodeComplexityMode {
  "low",
  "normal",
  "high"
};

partial dictionary RTCRtpEncodingParameters {
  RTCEncodeComplexityMode encodeComplexityMode = "normal";
};

encodeComplexityMode specifies the encoding complexity mode. "normal" is the default mode that provides a balance between device resource usage and compression efficiency suitable for most use cases. Relative to "normal" mode:

  • "low" mode results in lower device resource usage and worse compression efficiency
  • "high" mode results in higher device resource usage and better compression efficiency

The user agent SHOULD configure the encoder according to the encoding complexity mode specified. Changes in encoding performance are codec specific and are not guaranteed.
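Assuming a user agent that implements this proposal, a web application would set the mode through the existing `setParameters()` flow. The helper below is a hypothetical convenience wrapper for illustration, not part of the proposed API:

```javascript
// Hypothetical helper (not part of the proposal): set the proposed
// encodeComplexityMode on the first encoding of an RTCRtpSender.
// Works against any object exposing getParameters()/setParameters().
async function setEncodeComplexityMode(sender, mode) {
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    params.encodings = [{}];
  }
  params.encodings[0].encodeComplexityMode = mode;
  await sender.setParameters(params);
  return params;
}
```

For example, `await setEncodeComplexityMode(sender, "high")` would request better compression efficiency at the cost of higher device resource usage, per the semantics above.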

Details

A hardcoded mapping will be used to convert the complexity mode to encoding settings (CPUUSED in the case of the SW libaom/libvpx encoders, KEY_COMPLEXITY in the case of Android MediaCodec, etc.).

Relative differences in encoding performance between the encode complexity modes are not fixed and may change between binary releases due to changes in the underlying encoders and/or compilation settings.
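A user-agent-internal mapping along these lines could look like the sketch below. The numeric values are purely illustrative assumptions, not the actual hardcoded settings:

```javascript
// Illustrative sketch only: the real mapping is internal to the user
// agent and depends on codec, platform, CPU cores and resolution.
// These CPUUSED numbers are hypothetical, not the shipped values.
const LIBAOM_CPUUSED_BY_MODE = {
  low: 10,   // fastest: lowest CPU usage, worst compression efficiency
  normal: 8, // balanced default
  high: 6,   // slowest of the three: more CPU, better compression
};

function complexityModeToCpuUsed(mode) {
  // Fall back to the "normal" default for unknown or absent modes.
  return LIBAOM_CPUUSED_BY_MODE[mode] ?? LIBAOM_CPUUSED_BY_MODE.normal;
}
```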

ssilkin avatar Dec 11 '23 13:12 ssilkin

@aboba Can we add this to the grab bag in the January interim please?

Orphis avatar Dec 14 '23 10:12 Orphis

This also maps nicely to the Opus "complexity" concept (0-10, set via OPUS_SET_COMPLEXITY(x) / OPUS_SET_COMPLEXITY_REQUEST), where the "normal" value chosen by most applications is 9, with 5 being the "low" default on mobile platforms.
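The mapping suggested in this comment could be expressed as follows; note that the "high" value of 10 is an assumption for illustration, not stated above:

```javascript
// Opus complexity values (0-10) per the mapping above: "normal" = 9
// (what most applications choose), "low" = 5 (the mobile default).
// The "high" value of 10 is an assumption, not from the comment.
const OPUS_COMPLEXITY_BY_MODE = { low: 5, normal: 9, high: 10 };

function complexityModeToOpusComplexity(mode) {
  // Unknown or absent modes fall back to the "normal" default.
  return OPUS_COMPLEXITY_BY_MODE[mode] ?? OPUS_COMPLEXITY_BY_MODE.normal;
}
```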

fippo avatar Jan 16 '24 17:01 fippo

From the discussion, it seems we want to do two things:

  1. Be able to say that this stream we encode is less important than this other one. This would allow the UA to fine-tune its degradation heuristics.
  2. Tell the UA that we prefer high quality, or battery life. The UA might then want to tweak its encoder.

For 1, we already have https://w3c.github.io/webrtc-priority/#rtc-priority-type, so maybe we do not need anything? For 2, I wonder whether this should instead be a global setting at the scope of the peer connection. Maybe a preference with values like "quality" and "powerEfficiency" (plus "") would be good enough.

Note also that, if CPU is what is being controlled, this could also be applied to the receiving side, which could reduce its complexity somewhat by changing the rendering (longer audio chunks, dropping frame rate to 30 fps, ...).

youennf avatar Jan 16 '24 18:01 youennf

While it could be used as a way to prioritize resources when you have multiple streams, it's not just that: you may decide that a single stream still needs the setting.

And applying it to the whole page doesn't work either, as it may need to change dynamically during the application's lifetime, and I don't see how a global setting would be an effective API for that.

Another use case: when the application detects bad network quality and that resources are available, it may ask for more resources to be spent on encoding some important media stream. While the user agent can try to do some of that automatically, it is unreasonable to expect it to go past some threshold, since doing so could negatively impact other metrics; it's usually a trade-off.

The application may decide that in some circumstances this is OK, and this would be the setting to allow that. But you need local knowledge of the application and its usage in order to turn it on, and I believe a user agent is too high level to be able to infer all of that.

Also, FYI, I'd expect this setting to be mapped to VideoToolbox's PrioritizeEncodingSpeedOverQuality setting and maybe QP settings like MinAllowedFrameQP and MaxAllowedFrameQP.

Orphis avatar Jan 17 '24 10:01 Orphis

QP settings like MinAllowedFrameQP and MaxAllowedFrameQP.

These settings will have an impact on bitrate; I am not sure they will have any effect on CPU.

PrioritizeEncodingSpeedOverQuality setting

I am not sure this one kicks in on the low-latency code path. Overall, when hardware encoders are in use, CPU usage will probably not vary much. I do not know how KEY_COMPLEXITY is mapped to HW realtime Android encoders, or what the impact is. The OP mentions SW encoders; is this the target?

the application detects a bad network quality and that resources are available

I am not sure I understand how the application is supposed to know that resources are available, and I question how we can get interop between UAs. For instance, CPU adaptation could be done either with the encoder complexity mode or with media adaptation; ditto for bitrate.

The following might somehow work for video:

  • The web site sets an encoding complexity; the UA translates it to a fixed value.
  • When the UA hits bitrate or CPU contention, it is not expected to change the video encoder's complexity mode. Instead it relies on the current degradation preference, reducing either frame rate or resolution.
  • The web site may further tweak the encoding complexity based on the observed sent frame rate / resolution / bitrate.

Is this how this is supposed to be implemented and used?

As for audio, it cannot really do media adaptation, and I am unclear how it would know that CPU adaptation is needed.
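The three-step video flow described above could be driven from outbound-rtp stats, for example with a decision function like this sketch (the target frame rate and the 80% threshold are hypothetical tuning choices, not from the discussion):

```javascript
// Sketch of the third step above: pick a complexity mode from observed
// outbound-rtp stats. framesPerSecond is a standard outbound-rtp stats
// member; targetFps and the 0.8 threshold are assumptions.
function chooseComplexityMode({ framesPerSecond, targetFps }) {
  // If the encoder is falling well below the target frame rate,
  // back off to "low" to free up CPU; otherwise stay at "normal".
  if (framesPerSecond < 0.8 * targetFps) {
    return "low";
  }
  return "normal";
}
```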

youennf avatar Jan 18 '24 08:01 youennf

@aboba , Bernard, would it be possible to reserve a time slot to discuss this API at the next WebRTC WG meeting?

ssilkin avatar Sep 17 '24 13:09 ssilkin

@ssilkin If you are attending TPAC 2024, perhaps we can add discussion of this issue to Erik Sprang's slot in the joint MEDIA/WEBRTC WG meeting. If he agrees, you can add the slide(s) here.

aboba avatar Sep 17 '24 15:09 aboba

I can talk about this issue. I took the liberty of moving it from the joint webrtc/media WG session to the webrtc WG session on the 24th (taking some minutes from the stats topic I'll be talking about there). Hope that's ok!

sprangerik avatar Sep 19 '24 13:09 sprangerik

No problem.

aboba avatar Sep 19 '24 15:09 aboba

Thanks, @sprangerik!

ssilkin avatar Sep 23 '24 07:09 ssilkin

This didn't get discussed at TPAC, due to a power outage cutting the session short. There's a meeting tomorrow though; shall we bring it up then instead?

sprangerik avatar Oct 14 '24 10:10 sprangerik

I think there will be time. Ask Harald if he can add it to his slot.
Slides are here.

aboba avatar Oct 14 '24 22:10 aboba

I don't seem to have access to update this issue, but I'm thinking @ssilkin would be the better owner for this bug.

sprangerik avatar Dec 13 '24 09:12 sprangerik