Add support for SEI

Open leonardoFu opened this issue 2 years ago • 6 comments

SEI is useful when we want to render content in sync with the video element, for example subtitles, triggers for events or animations, body-shape masks, and so on (see this explainer for the full list of use cases). But:

  1. For file-based video playback (MP4, or HLS on iOS Safari), there is no way to get SEI from the video stream, because the video element does not dispatch any event for SEI.
  2. If we use MSE to play the video, we have to parse the AVC stream ourselves to extract the SEI and then sync it with the video frames, which is complicated.

So, could SEI become a supported DataCue source? Then we could get SEI information from the video element through the DataCue API:

const video = document.getElementById('video');

video.textTracks.addEventListener('addtrack', (event) => {
  const textTrack = event.track;

  if (textTrack.kind === 'metadata') {
    textTrack.mode = 'hidden';

    // See cueChangeHandler examples below
    textTrack.addEventListener('cuechange', cueChangeHandler);
  }
});

const cueChangeHandler = (event) => {
  const metadataTrack = event.target;
  const activeCues = metadataTrack.activeCues;

  for (let i = 0; i < activeCues.length; i++) {
    const cue = activeCues[i];
    // each cue would carry one SEI message
  }
};

leonardoFu avatar Jun 01 '22 16:06 leonardoFu

Your example should work in Safari on macOS and iOS today. Does it not work?

eric-carlson avatar Jun 01 '22 18:06 eric-carlson

> Your example should work in Safari on macOS and iOS today. Does it not work?

For DataCue in general I think it does, but SEI data is not supported as a source, is it?

leonardoFu avatar Jun 06 '22 03:06 leonardoFu

If SEI events have zero duration, and are instead associated with individual video frames, you should prefer to use DataCue enter and exit events rather than cuechange events, as the activeCues list in the cuechange handler isn't guaranteed to include the cue (see https://www.w3.org/TR/media-timed-events/#using-cues-to-track-progress-on-the-media-timeline).
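A minimal sketch of the enter/exit approach, assuming the browser exposes the in-band cues through `track.cues`. The helper name `attachCueListeners` and the `seen` set are hypothetical, introduced only for illustration. It adds `enter` and `exit` listeners to every cue that has not been handled yet, so a zero-duration cue is still reported even if it never shows up in `activeCues`:

```javascript
// Sketch only: attaches enter/exit listeners to each cue in `track.cues`
// that has not been seen before. `seen` is a WeakSet owned by the caller.
// Returns the number of cues that received listeners in this call.
function attachCueListeners(track, seen, onEnter, onExit) {
  const cues = track.cues;
  if (!cues) return 0; // `cues` is null while the track mode is 'disabled'

  let attached = 0;
  for (let i = 0; i < cues.length; i++) {
    const cue = cues[i];
    if (seen.has(cue)) continue; // listeners already attached earlier
    seen.add(cue);
    cue.addEventListener('enter', () => onEnter(cue));
    cue.addEventListener('exit', () => onExit(cue));
    attached++;
  }
  return attached;
}
```

Because there is no event that announces newly added in-band cues, the helper would have to be called again from time to time as more media is buffered; which trigger suits that best is left open here.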

chrisn avatar Jun 06 '22 14:06 chrisn

Note: SEI (Supplemental Enhancement Information) is described in H.264 Annex D. The spec describes a number of predefined messages (see payloadType), which could be used by the video decoder or renderer. Section D.1.7 describes payloadType 5 as "User data unregistered SEI message syntax". Which payload types would you expect to expose to web apps via DataCue?
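For reference, here is a rough sketch of how payloadType and payloadSize are coded in an SEI NAL unit: each is a run of 0xFF bytes (255 apiece) followed by one final byte. The function names are hypothetical, and the sketch assumes the input is the SEI RBSP with the NAL header byte stripped and the emulation-prevention bytes already removed. It also treats a lone final 0x80 byte as the trailing bits, which is a simplification:

```javascript
// Sketch only: lists the SEI messages found in an H.264 SEI RBSP.
// Assumes `rbsp` is a Uint8Array without the NAL header byte and with
// emulation-prevention bytes already removed.
function parseSeiMessages(rbsp) {
  const messages = [];
  let pos = 0;

  // Reads a value coded as a run of 0xFF bytes plus one final byte.
  const readFfCoded = () => {
    let value = 0;
    while (pos < rbsp.length && rbsp[pos] === 0xff) {
      value += 255;
      pos++;
    }
    if (pos >= rbsp.length) throw new RangeError('Truncated SEI header');
    value += rbsp[pos++];
    return value;
  };

  // A lone final 0x80 byte is treated as rbsp_trailing_bits.
  const moreData = () =>
    pos < rbsp.length && !(pos === rbsp.length - 1 && rbsp[pos] === 0x80);

  while (moreData()) {
    const payloadType = readFfCoded();
    const payloadSize = readFfCoded();
    if (pos + payloadSize > rbsp.length) {
      throw new RangeError('Truncated SEI payload');
    }
    messages.push({ payloadType, payload: rbsp.subarray(pos, pos + payloadSize) });
    pos += payloadSize;
  }
  return messages;
}

// payloadType 5 (user data unregistered): 16-byte UUID, then app-defined bytes.
function splitUserDataUnregistered(payload) {
  if (payload.length < 16) throw new RangeError('Payload shorter than UUID');
  return { uuid: payload.subarray(0, 16), data: payload.subarray(16) };
}
```

A DataCue for payloadType 5 could then carry the UUID and the remaining bytes, leaving their interpretation to the web app.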

chrisn avatar Jun 06 '22 15:06 chrisn

I have some more questions:

If SEI information is associated with individual video frames, how far ahead of the current playback position should SEI events be surfaced, to give the web application enough time to act on the information?

Also, do you expect that the web application can update the browser DOM in response to SEI information in a way that is frame accurate and synchronized to the video?

chrisn avatar Jun 10 '22 10:06 chrisn

I think how far ahead of the current playback position SEI events should be surfaced depends on the video buffer. Consider the pipeline:

network -> demux(get SEI) -> buffer -> decode -> render;

It can take at least 2 s (low-latency HLS) for a frame to pass through the video buffer, so I think that leaves enough time for web apps to process the SEI information.

As for synchronization, I think requestVideoFrameCallback is a good way to sync SEI rendering with the exact video frame. And if we want to control the render pipeline ourselves, we can use WebCodecs.
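A rough sketch of what I mean by the requestVideoFrameCallback approach, assuming the SEI entries have already been collected into an array of `{ time, data }` objects sorted by presentation time in seconds. The names `takeDueEntries`, `startSeiSync`, and `render` are hypothetical. On every presented frame, the callback takes the entries whose time is at or before the frame's `mediaTime` and hands them to the renderer:

```javascript
// Sketch only: removes and returns the queued entries whose presentation
// time is at or before `mediaTime`. `queue` must be sorted by ascending time.
function takeDueEntries(queue, mediaTime) {
  let count = 0;
  while (count < queue.length && queue[count].time <= mediaTime) count++;
  return queue.splice(0, count);
}

// Re-registers itself on every frame, since requestVideoFrameCallback
// fires only once per registration.
function startSeiSync(video, queue, render) {
  const onFrame = (now, metadata) => {
    for (const entry of takeDueEntries(queue, metadata.mediaTime)) {
      render(entry, metadata);
    }
    video.requestVideoFrameCallback(onFrame);
  };
  video.requestVideoFrameCallback(onFrame);
}
```

This sketch does not handle seeking backwards, since entries are dropped from the queue once they have been rendered.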

leonardoFu avatar Jun 16 '22 15:06 leonardoFu