libobs: Add Broadcast Performance Metrics (BPM)
Description
This PR implements initial support for Broadcast Performance Metrics (BPM). BPM currently sends the following metrics in-band with the video bitstream, using SEI messages (for AVC/HEVC) or OBUs (for AV1), with each IDR frame, typically at 2-second intervals:
- UTC-based wall-clock timestamps using RFC3339 format
- Session frame counters: rendered, lagged, dropped, output
- Encoded rendition frame counters: input, skipped, output
The metrics are intended to be used by live-streaming services such as Twitch and Amazon IVS, for example in plugins such as multitrack/eRTMP. BPM generation and delivery are enabled for an output by registering the `bpm_injection()` callback with `obs_output_add_packet_callback()`. This packet callback mechanism is introduced with this PR; the initial implementation (inside libobs) was removed in favour of a callback API at the packet-processing level.
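To make the registration flow concrete, here is a minimal sketch of how an output enables and later tears down BPM. Only the names `bpm_injection()`, `bpm_destroy()`, and `obs_output_add_packet_callback()` come from this PR and the review thread below; the prototypes (including the remove function) are assumptions for illustration and may differ from the final headers.

```c
/*
 * Sketch only: the prototypes below are assumptions mirroring this PR's
 * discussion, not the final libobs API.
 */
typedef struct obs_output obs_output_t;
struct encoder_packet;
struct encoder_packet_time;

typedef void (*packet_cb_t)(obs_output_t *output, struct encoder_packet *pkt,
			    struct encoder_packet_time *pkt_time, void *param);

extern void bpm_injection(obs_output_t *output, struct encoder_packet *pkt,
			  struct encoder_packet_time *pkt_time, void *param);
extern void bpm_destroy(obs_output_t *output);
extern void obs_output_add_packet_callback(obs_output_t *output,
					   packet_cb_t callback, void *param);
extern void obs_output_remove_packet_callback(obs_output_t *output,
					      packet_cb_t callback, void *param);

/* Enable BPM: packets flowing through this output now get BPM SEI/OBU
 * metadata injected on IDR frames by bpm_injection(). */
static void enable_bpm(obs_output_t *output)
{
	obs_output_add_packet_callback(output, bpm_injection, NULL);
}

/* Disable BPM during output teardown and release any per-output state. */
static void disable_bpm(obs_output_t *output)
{
	obs_output_remove_packet_callback(output, bpm_injection, NULL);
	bpm_destroy(output);
}
```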
Detailed BPM documentation is available in the Multitrack Video Integration Guide.
Motivation and Context
To enable ongoing improvement of automatic stream configuration and to deliver the best possible stream settings, broadcast performance metrics (BPM) must be measured and sent. The metrics are collected and sent in-band via either SEI (for AVC/HEVC) or OBU (for AV1) messages. Two classes of data are collected:
- Timestamps are collected to measure end-to-end latency between the broadcaster and the viewer. They are useful for:
  - Providing the broadcaster or audience with an estimate of end-to-end latency
  - Analyzing timestamp jitter that may indicate system stress or poor first-mile network connectivity
  - Referencing real-world event time for aligning and aggregating time-series counter data
- Frame counters are collected to measure the performance of the broadcast software and video encoders at the frame level. They are useful for:
  - Providing broadcasters with a performance dashboard that includes additional signals, to help them improve their streaming setup
  - Providing a proactive signal that may correlate with environmental changes like newly released GPU drivers or OS versions/patches
  - Providing feedback to enable video services to safely iterate and release improvements to GetClientConfiguration, including support for new hardware vendors, new GPU models, new codecs, new driver features, additional video-encoder setting tuning, and new user-controlled presets (e.g., “Dual PC Setup” vs. “Gaming+Streaming Setup”)
The timestamp sent from the broadcaster is based on a global common reference clock, typically an NTP-synchronized clock using the UTC+0 timezone. RFC3339 is commonly used for this kind of "Internet time." This provides an absolute reference, making temporal difference calculations trivial. The timestamps have millisecond resolution, which is sufficient both for measuring end-to-end latency and for frame-level timestamps.
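For illustration only (this is not the libobs implementation), an RFC3339 UTC timestamp with millisecond resolution can be produced in portable POSIX C like this:

```c
#include <stdio.h>
#include <time.h>

/* Illustrative only: format the current UTC wall-clock time as an
 * RFC3339 string with millisecond resolution, e.g.
 * "2024-05-01T17:23:45.123Z". The actual libobs code may differ;
 * this just shows the target format. */
static void format_rfc3339_ms(char *buf, size_t size)
{
	struct timespec ts;
	struct tm tm_utc;
	char date[32];

	clock_gettime(CLOCK_REALTIME, &ts);
	gmtime_r(&ts.tv_sec, &tm_utc);

	/* Date/time portion down to whole seconds */
	strftime(date, sizeof(date), "%Y-%m-%dT%H:%M:%S", &tm_utc);

	/* Append milliseconds and the UTC ("Z") designator */
	snprintf(buf, size, "%s.%03ldZ", date, ts.tv_nsec / 1000000L);
}

int main(void)
{
	char stamp[48];
	format_rfc3339_ms(stamp, sizeof(stamp));
	printf("%s\n", stamp);
	return 0;
}
```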
The events being timestamped for video frames are:
- CTS (Composition timestamp): When the composited frame was rendered by OBS
- FER (Frame encode request timestamp): When the frame was requested to be encoded
- FERC (Frame encode request complete timestamp): When the frame encode request was completed
- PIR (Packet interleave request timestamp): When the compressed frame (packet) was sent to the output plugin
The time difference between PIR and CTS represents the majority of the client latency.
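As an illustration of how these four events relate, a hypothetical per-frame record (field names and units are illustrative, not libobs's) might look like the following, with the PIR minus CTS difference giving the client-side latency:

```c
#include <stdint.h>

/* Hypothetical per-frame record for the four events above, stored here
 * as milliseconds since the UNIX epoch. Field names and units are
 * illustrative only; the real libobs structure may differ. */
struct frame_timing_example {
	int64_t cts;  /* composition: frame rendered by OBS        */
	int64_t fer;  /* frame encode request submitted            */
	int64_t ferc; /* frame encode request completed            */
	int64_t pir;  /* packet interleave request: sent to output */
};

/* PIR - CTS covers the majority of the client-side latency. */
static inline int64_t client_latency_ms(const struct frame_timing_example *t)
{
	return t->pir - t->cts;
}
```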
How Has This Been Tested?
BPM was tested in the Twitch Enhanced Broadcasting beta by streaming multi-rendition and single-track sessions into the Twitch ingest service, parsing the metrics, and sending the results to the control-plane back-end. Additionally, it was tested with AVC, HEVC, and AV1, with and without closed caption injection. BPM and closed captions use a very similar mechanism of injecting data via SEI/OBU, hence the need to test each alone and both together. Multiple bitstreams were also captured and verified with a stream analyzer for correctness.
Types of changes
- New feature (non-breaking change which adds functionality)
Checklist:
- [x] My code has been run through clang-format.
- [x] I have read the contributing document.
- [x] My code is not on the master branch.
- [x] The code has been tested.
- [x] All commit messages are properly formatted and commits squashed where appropriate.
- [x] I have included updates to all appropriate documentation.
I'm curious, does/will this information get passed through to the client side (of twitch, at least)? Because that would be really useful.
(For bonus points, make it so the timestamps represent the time the associated frame was created, rather than when it was output from the encoder, which would not only make them even more useful, but also measure encoder delay as part of the end-to-end measurement. At least, I don't think that's what it's doing right now)
> I'm curious, does/will this information get passed through to the client side (of twitch, at least)? Because that would be really useful.
Short answer: yes. Long answer, a few points!
- Yes, the embedded SEIs do get passed through transmuxing to the delivered video segments on Twitch.
- However, we haven't decided exactly if/how we will intentionally expose it to either broadcasters or viewers; we are considering ideas like putting it in Twitch Inspector or in "Video Stats for Nerds" in the player. First, we have to validate that the data pipeline works at scale and that we trust the data enough to expose it to end users.
- In the interim, this data is absolutely crucial for us to continue iterating on Automatic Stream Configuration settings (e.g. deploying HEVC or 1440p/4k ladders.) Without it, we can't safely iterate and change client settings (because we don't want to break users' performance.)
The title of the commit fea442ebd should start with `obs-webrtc:` instead of `WHIP:`.
A few things:
- I would highly prefer that we include a millisecond-accurate signed 64-bit integer UNIX epoch timestamp instead of an RFC3339 string.
- Would it be possible to split BPM out of libobs, and instead add functionality for adding private SEI payloads to encoders? Something along the lines of `obs_encoder_add_private_sei(obs_encoder_t *, void *buf, size_t size)`. The BPM code itself could probably live in the UI, running off of a QTimer as the trigger.
> The title of the commit fea442e should start with `obs-webrtc:` instead of `WHIP:`.
Thank you @norihiro. I've updated the commit with the correct name.
> A few things:
>
> * I would highly prefer that we include a millisecond-accurate signed 64-bit integer UNIX epoch timestamp instead of an RFC3339 string.
In the integration guide, on page 28, we defined a type system for timestamp data. `timestamp_type == 2` is the `duration_since_epoch_ts` form, and there is also a `delta_ts` format to signal differences. At the moment we're using type 1, the RFC3339 format, mainly because it was the easiest way for us to get started in the beta testing, since the ingest servers were already speaking RFC3339. We can add the type 2 and type 3 formats in the future, and any parser would have to respect them. Adding type 2 and/or type 3 (or any other types we come up with) is on the backlog, just not scoped or planned at the moment.
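For comparison, here is a minimal POSIX C illustration (not OBS or guide code) of the signed 64-bit epoch-millisecond value that such a `duration_since_epoch_ts` type would carry, versus the RFC3339 string currently sent:

```c
#include <stdint.h>
#include <time.h>

/* Illustrative only: millisecond-accurate signed 64-bit UNIX epoch
 * timestamp, the alternative to the RFC3339 string discussed above. */
static int64_t epoch_ms_now(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_REALTIME, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000L;
}
```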
> * Would it be possible to split BPM out of libobs, and instead add functionality for adding private SEI payloads to encoders? Something along the lines of `obs_encoder_add_private_sei(obs_encoder_t *, void *buf, size_t size)`. The BPM code itself could probably live in the UI, running off of a QTimer as the trigger.
I need to think about this a bit more. My initial thought revolves around the trigger. The current implementation is triggered from the `send_interleaved()` function because that was the best spot I could find for identifying the IDR frames to align with. We need the data aligned to IDR frames, so moving the trigger to an async QTimer seems like it might introduce jitter between the IDR time and the QTimer firing, with the assumption that we want tight (preferably synchronous) alignment.
I would like to suggest that we implement a packet callback for outputs. This callback would be triggered for every video packet, shortly before it is passed to the output to be transmitted. The BPM SEI rendering code can be moved to the common folder in the base of this repository, similar to happy eyeballs and media-playback. Here are some suggested APIs:

* Typedefs
  * `struct array_output_data *(*obs_output_packet_callback_t)(obs_output_t *, struct encoder_packet *pkt, struct encoder_packet_timing *timing, void *)`
* Functions
  * `obs_output_add_packet_callback(obs_output_t *output, obs_packet_callback_t callback, void *data);`
  * `obs_output_add_packet_remove(obs_output_t *output, obs_packet_callback_t callback, void *data);`

It is important to note that this callback is implemented per-output, so the primary buffer in the packet will likely need to be duplicated before libobs inserts the SEI payload, to prevent SEI data from being sent to other outputs subscribed to the same encoders.
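To make the per-output duplication concern concrete, here is a conceptual C sketch (not libobs code; `make_private_packet()` is hypothetical) of copying the shared packet buffer and injecting the SEI into the copy only:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Conceptual sketch: several outputs can share one encoder, so a
 * per-output callback must not splice SEI bytes into the shared packet
 * buffer. Instead it builds a private copy with the SEI added and hands
 * that to the output it serves. Where the SEI actually goes within the
 * access unit depends on the codec; prepending here is purely
 * illustrative. */
static uint8_t *make_private_packet(const uint8_t *pkt_data, size_t pkt_size,
				    const uint8_t *sei, size_t sei_size,
				    size_t *out_size)
{
	uint8_t *copy = malloc(sei_size + pkt_size);
	if (!copy)
		return NULL;

	memcpy(copy, sei, sei_size);                 /* injected SEI/OBU */
	memcpy(copy + sei_size, pkt_data, pkt_size); /* original payload */

	*out_size = sei_size + pkt_size;
	return copy; /* caller frees; the shared buffer stays untouched */
}
```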
Thanks for the suggestions. The "BPM SEI rendering code" can be thought of as two parts: 1. the calculation of the frame counters and timing deltas (once we implement delta time); and 2. the rendering of this data into SEI payload syntax. Are you envisioning moving both parts 1 and 2 into the callback, or just part 2? I'm leaning towards both parts being handled in the callback function, but wanted to understand what you were thinking as well.
Also, if we employ this approach, the closed caption SEI could be supported with the same callback technique. I'd like to keep the migration of the current closed caption code to the potentially new callback mechanism as a separate workstream, and focus purely on the BPM support for this PR. Is this OK with you?
I'd also like to understand if this packet callback should be available for use by any plugin, or would this be only for "internal" plugins (for lack of a better term) compiled natively with OBS? I believe it's the former (any plugin could use it), but would like to know for sure.
With the latest updates, frame timing is now propagated from the encoder array to each output array. This allows frame timing to work with multiple output services in tandem. The initial implementation would drain the encoder frame timing array after the first consumption of the data.
I'm moving the PR to "Ready for review", and will work on renaming bpm_frame_time to encoder_packet_timing (or something along these lines).
> Thanks for the suggestions. The "BPM SEI rendering code" can be thought of as two parts: 1. the calculation of the frame counters and timing deltas (once we implement delta time); and 2. the rendering of this data into SEI payload syntax. Are you envisioning moving both parts 1 and 2 into the callback, or just part 2? I'm leaning towards both parts being handled in the callback function, but wanted to understand what you were thinking as well.
Uhh I think both parts would be handled in the callback? I don't know if the frame counters and timing deltas are before or after the encoder_packet_timing struct finalization.
> Also, if we employ this approach, the closed caption SEI could be supported with the same callback technique. I'd like to keep the migration of the current closed caption code to the potentially new callback mechanism as a separate workstream, and focus purely on the BPM support for this PR. Is this OK with you?
It is definitely not necessary to do any immediate migration of the captions code to this callback, just as long as it's kept in mind.
> I'd also like to understand if this packet callback should be available for use by any plugin, or would this be only for "internal" plugins (for lack of a better term) compiled natively with OBS? I believe it's the former (any plugin could use it), but would like to know for sure.
As with other libobs APIs of similar design, this would be available to any code which has a copy of the encoder's pointer. That is usually the same code that created the encoder, but there are places where a third party plugin could fetch an encoder it is not managing, like through the frontend API. Realistically, if you register a callback of your own on this API, you would be expected to disconnect the callback as a part of your own cleanup procedure, or it would be disconnected by libobs if the encoder is destroyed.
@tt2468 @RytoEX I've pushed a series of commits earlier today that should satisfy the requested changes:
- Renamed `bpm_frame_time` to `encoder_packet_time`. `encoder_packet_time` handling remains within libobs and flows from the encoding functions (stored in the `encoder_t` packet_times array) to the `output_t` packet_times arrays.
- BPM code has been completely removed from libobs and moved to `deps/bpm`.
- The `enable_bpm` flag has been removed, along with the public API.
- A packet callback mechanism has been created. Registered packet callbacks are invoked from a loop in `send_interleaved()`.
- BPM is the first user of the new packet callback mechanism. `bpm_injection()` is introduced as the callback function, as well as `bpm_destroy()` to deallocate. `bpm.h` is the public interface to BPM and simply contains the 2 function signatures.
- BPM is enabled in multitrack video by using the new packet callback add/remove functions.
- I've disabled BPM in WebRTC for the moment as it needs more testing with the new mechanism.
We can squash the new commits into old ones, and/or rework the branch as needed. I'd like to get an initial review first as I don't want to rewrite the branch history until the new code seems OK. I'll definitely keep a local copy of the original branch though for safekeeping before squashing anything.
@tt2468 I've rebased against master as of this morning, fixed conflicts, moved from deps/bpm to shared/bpm, and added a commit with documentation for the add/remove packet callback functions. This should be close to complete now, and if it looks OK, I can squash commits to make it prettier.
Regarding memory leaks, I have been running in debug mode and checking the number of memory leaks reported, which is 0, so I'm confident there's no issue there.
I don't understand why the "gersemi" validation is complaining about the format of the CMakeLists.txt files and could use some guidance on what might be wrong.
> I don't understand why the "gersemi" validation is complaining about the format of the CMakeLists.txt files and could use some guidance on what might be wrong.
Gersemi is the new CMake formatter being used in CI; you should switch from cmake-format to https://github.com/BlankSpruce/gersemi to keep your files formatted.
@tt2468 I've squashed the commits now. Much cleaner. Please review and let me know next steps, if any.