Add proposed mapping for in-band DASH emsg events to DataCue
The explainer should describe how DASH emsg values are mapped to DataCue objects.
Consistent with the existing WebKit implementation, this could be something like:
| DataCue field | emsg value |
|---|---|
| `DOMString id` | `id` |
| `double startTime` | Computed from `timescale` and `presentation_time_delta` |
| `double endTime` | Computed from `timescale`, `presentation_time_delta`, and `event_duration` |
| `boolean pauseOnExit` | `false` |
| `any value` | Object containing `data`, `schemeIdUri`, and `value` |
| `DOMString type` | `"urn:mpeg:dash:emsg"` (or similar, TBD) |
and
| value field | emsg value | Description |
|---|---|---|
| `ArrayBuffer messageData` | `message_data` | Message body (may be empty) |
| `DOMString schemeIdUri` | `scheme_id_uri` | Identifies the message scheme. The semantics and syntax of the message data are defined by the owner of the identified scheme. The string may use URN or URL syntax |
| `DOMString value` | `value` | Specifies the value for the event. The value space and semantics are defined by the owners of the scheme identified by `scheme_id_uri` |
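As a sketch of how the two tables above could fit together, the mapping might look like the function below. The anchor for `presentation_time_delta` (here a `segmentStartTime` parameter) and the exact `type` string are assumptions, not settled by this proposal.

```javascript
// Hedged sketch of the proposed emsg -> DataCue field mapping.
// `segmentStartTime` is an assumed anchor for presentation_time_delta.
function emsgToDataCueFields(emsg, segmentStartTime) {
  const startTime = segmentStartTime + emsg.presentation_time_delta / emsg.timescale;
  const endTime = startTime + emsg.event_duration / emsg.timescale;
  return {
    id: String(emsg.id),
    startTime,
    endTime,
    pauseOnExit: false,
    type: 'urn:mpeg:dash:emsg', // placeholder string; final value TBD
    value: {
      messageData: emsg.message_data,   // ArrayBuffer, may be empty
      schemeIdUri: emsg.scheme_id_uri,  // identifies the message scheme
      value: emsg.value,                // scheme-defined event value
    },
  };
}
```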
Does this proposal look OK?
Do we need to expose `timescale`, `presentation_time_delta`, and `event_duration` directly to web applications, rather than only using them to compute the cue's `startTime` and `endTime`?
For example:
- MPD validity expiration events (DASH spec section 5.10.4.2) give particular meaning to the case where `presentation_time_delta` and `event_duration` are both zero
- SCTE 214-3 specifies an `event_duration` value of 0xFFFF as "unknown". How should this be interpreted?
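To make the two special cases concrete, a player might classify an emsg's timing along these lines. The classification labels are mine, and how these cases should surface on a DataCue (for example, whether "unknown" maps to `endTime = Infinity`) is exactly the open question.

```javascript
// Hedged sketch: classifying the special timing cases listed above.
const UNKNOWN_DURATION = 0xFFFF; // SCTE 214-3 sentinel for "unknown"

function classifyEmsgTiming(emsg) {
  if (emsg.presentation_time_delta === 0 && emsg.event_duration === 0) {
    // DASH section 5.10.4.2 gives this case special meaning for
    // MPD validity expiration events
    return 'dual-zero';
  }
  if (emsg.event_duration === UNKNOWN_DURATION) {
    return 'unknown-duration'; // one option: map to endTime = Infinity
  }
  return 'timed';
}
```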
What are the rules for placing the event on the media element's media timeline?
Through our discussions in the DASH-IF group we've aligned on the idea that the player uses the `emsg` data to compute the properties `presentation_time` and `duration`, which are sourced differently depending on the `emsg` version used. This means we wouldn't have the burden of exposing the raw `emsg` values through the `DataCue` objects.
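The version-dependent sourcing can be sketched as follows: emsg version 0 carries a relative `presentation_time_delta`, while version 1 carries an absolute `presentation_time`. The `segmentAnchor` parameter (the timeline position the delta is relative to) is an assumption here.

```javascript
// Hedged sketch of computing presentation time per emsg version.
// `segmentAnchor` stands in for the anchor of presentation_time_delta.
function emsgPresentationTime(emsg, segmentAnchor) {
  if (emsg.version === 0) {
    // version 0: relative delta, in timescale units
    return segmentAnchor + emsg.presentation_time_delta / emsg.timescale;
  }
  // version 1: presentation_time is already absolute, in timescale units
  return emsg.presentation_time / emsg.timescale;
}
```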
Technically the event specification asks to expose `presentation_time` and `duration` directly, but I'd argue `startTime` and `endTime` are sufficient if the unknown-duration signalling case is captured properly. (I think you are capturing that in https://github.com/WICG/datacue/issues/18)
The MPD expiration event signalling the presentation end with the dual zero values is an interesting case; I'm not sure we even discussed it in the DASH-IF Event TF. I'll take that back for further discussion: https://github.com/Dash-Industry-Forum/Events/issues/66
On the proposal specifics: would it be correct to interpret that the current expectation is that all `emsg` events would be exposed as a single Track for consumption? With this lens, the top-level attributes and value structure proposed look good.
However, if I were to map back our thoughts from the DASH-IF Event side, we generally treat the `scheme_id_uri` and `value` pairs as standalone event tracks, with `emsg` events as a method that can carry multiple tracks. In the context of the UA parsing messages this could be tricky, because you might be adding and removing Tracks constantly as `emsg` boxes with schemes are added to and removed from the media buffer. Conceptually we've seen the model as the application forward-registering for schemes that it understands; that way the player only needs to track and surface matching events.
Could I propose a similar pattern here to make `emsg` mapping simpler? Assuming a Type 3 playback context, with the UA parsing the boxes rather than an application-level library: the expectation could be that the library creates the Track objects it expects to be filled from in-band sources on the media element, identifying each Track by the `scheme_id_uri` and `value` pair it is looking for. As the UA discovers `emsg` boxes, it checks whether a Track matching the `scheme_id_uri` and `value` pair has been created; if so, it creates the `DataCue` as specified above, except with `cue.value = message_data` instead of the value structure. In this situation the application would understand the need to create a track based on the presence of an `InbandEventStream` element within an `AdaptationSet` in the MPD.
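The registration pattern described above might look something like the sketch below. The registry class and its methods are hypothetical stand-ins (there is no such API in any spec); the plain arrays stand in for the TextTracks the application would actually create and observe.

```javascript
// Hedged sketch: the app pre-registers (scheme_id_uri, value) pairs, and
// the UA only surfaces matching emsg boxes, with cue.value = message_data.
// This registry and its methods are hypothetical, for illustration only.
class EmsgTrackRegistry {
  constructor() { this.tracks = new Map(); }
  key(schemeIdUri, value) { return `${schemeIdUri}|${value}`; }
  register(schemeIdUri, value) {
    const cues = [];
    this.tracks.set(this.key(schemeIdUri, value), cues);
    return cues; // stands in for a TextTrack the app would observe
  }
  onEmsgBox(emsg, startTime, endTime) {
    const cues = this.tracks.get(this.key(emsg.scheme_id_uri, emsg.value));
    if (!cues) return false; // no matching registration: drop the event
    cues.push({ startTime, endTime, value: emsg.message_data });
    return true;
  }
}
```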
> would it be correct to interpret that the current expectation is that all emsg events would be exposed as a single Track for consumption?
That's the current thinking, yes. But this does leave a gap compared to the DASH-IF Events model, as DataCue has not defined a mechanism to let the web app tell the browser which event streams to surface (by providing the `scheme_id_uri` and `value`), and so the assumption would be that the browser surfaces all `emsg` boxes parsed from the media, and the web app is responsible for ignoring any that it doesn't want to handle. Is that reasonable, or do we need to provide such an API?