What is the impact of timestamp for video frames enqueued in VideoTrackGenerator?
VideoFrame objects enqueued in a VideoTrackGenerator carry a timestamp. As per existing WPT tests, timestamps are expected to be preserved when doing VideoTrackGenerator->MediaStreamTrackProcessor. This means a web application is free to enqueue video frames with non-increasing timestamps, say 1, 2, 4 and then back to 1 and 3.
Another approach would be for VideoTrackGenerator to consider that the timestamp of a frame is the time at which it gets enqueued.
This relates to the somewhat bigger issue tracked in https://github.com/w3c/mediacapture-transform/issues/88.
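For concreteness, here is a minimal sketch of the scenario above, assuming a VideoTrackGenerator piped back into a MediaStreamTrackProcessor; the OffscreenCanvas is only a convenient way to fabricate frame contents, and the timestamp values are the ones from the example:

```js
// Sketch: enqueue frames with non-increasing timestamps into a
// VideoTrackGenerator and read them back through a MediaStreamTrackProcessor.
// Per the current WPT expectations, the timestamps come out unchanged.
const generator = new VideoTrackGenerator();
const processor = new MediaStreamTrackProcessor({ track: generator.track });
const writer = generator.writable.getWriter();
const reader = processor.readable.getReader();

// Frame pixels don't matter here; an OffscreenCanvas is just an easy
// way to construct VideoFrames.
const canvas = new OffscreenCanvas(64, 64);
canvas.getContext('2d').fillRect(0, 0, 64, 64);

for (const timestamp of [1, 2, 4, 1, 3]) {
  await writer.write(new VideoFrame(canvas, { timestamp }));
  const { value: frame } = await reader.read();
  console.log(frame.timestamp); // expected: 1, 2, 4, 1, 3 in order
  frame.close();
}
```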
I think out-of-order behaviour is probably necessary in order to support the full range of NACK/RTX functionality in JavaScript.
I did an experiment in Chrome, capturing a track from a camera, encoding the VideoFrames with WebCodecs and serializing each EncodedVideoChunk and its timestamp on the wire using WebTransport. On the receiver, the EncodedVideoChunk is decoded, the timestamp is set, and MSTG is used to generate a track, with RVFC used to retrieve metrics such as captureTime.
The RVFC spec says that captureTime is only set on locally captured tracks, but it was present on the remote track. This leads me to believe that in Chrome the timestamp is not actually a "presentation timestamp" as defined in WebCodecs, but rather a capture timestamp.
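The receive side of an experiment like that could look roughly as follows, using the spec's VideoTrackGenerator in place of Chrome's MediaStreamTrackGenerator. receiveFromWebTransport() and deserializeChunk() are hypothetical helpers standing in for the actual WebTransport plumbing and wire format, and the codec string is an arbitrary choice:

```js
// Sketch of the receive side: decode chunks from the wire, keep the
// sender's timestamp on the decoded VideoFrame, feed it to a
// VideoTrackGenerator, and observe metadata via requestVideoFrameCallback.
const generator = new VideoTrackGenerator();
const writer = generator.writable.getWriter();

const decoder = new VideoDecoder({
  async output(frame) {
    // The decoder stamps the frame with the chunk's timestamp; writing it
    // through unchanged keeps that value on the generated track.
    await writer.write(frame);
  },
  error(e) { console.error(e); }
});
decoder.configure({ codec: 'vp8' }); // codec choice is an assumption

const video = document.querySelector('video');
video.srcObject = new MediaStream([generator.track]);
await video.play();

const onFrame = (now, metadata) => {
  // Whether captureTime is present here is the point of the experiment.
  console.log(metadata.captureTime, metadata.mediaTime);
  video.requestVideoFrameCallback(onFrame);
};
video.requestVideoFrameCallback(onFrame);

// deserializeChunk() is a hypothetical helper that turns wire bytes into
// { data, timestamp, type } for an EncodedVideoChunk.
for await (const bytes of receiveFromWebTransport()) {
  const { data, timestamp, type } = deserializeChunk(bytes);
  decoder.decode(new EncodedVideoChunk({ type, timestamp, data }));
}
```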
VideoTrackGenerator should probably not care about VideoFrame.timestamp.
Its processing of frames is to "send clone to track", which is a bit vague.
My interpretation is that it submits the frame to each of the track's sinks immediately, without any buffering.
The timestamp therefore has no impact on what VideoTrackGenerator does.
Maybe we should clarify this in the spec.
It is then up to each sink to determine what to do with the VideoFrame timestamp.
RVFC states how to populate captureTime based on the track type:
For video frames coming from a local source, this is the time at which the frame was captured by the camera. For video frames coming from a remote source, the capture time is based on the RTP timestamp of the frame and estimated using clock synchronization.
Given that track->MSTP->pipeTo->VTG->RVFC should be the same as track->RVFC, we could have the following rules:
1. RVFC sets the capture time as the value of VideoFrame.timestamp.
2. VideoFrame.timestamp for camera/screen-capture tracks is the capture time as defined in RVFC.
3. VideoFrame.timestamp for WebRTC remote tracks is the capture time as defined in RVFC.
4. VideoFrame.timestamp is preserved as-is by VideoTrackGenerator.
5. VideoFrame.timestamp for canvas tracks is the time at which the canvas is snapshotted (somewhat similar to camera tracks).
6. Each other sink should describe how it uses VideoFrame.timestamp, meaning HTMLVideoElement, RTCRtpSender and MediaRecorder. By default, it does nothing with it.
I think 1, 2, 3 and 4 are aligned with @aboba's Chrome testing, but this does not seem to be described anywhere. Ideally, each spec defining a video source would define how the VideoFrame objects exposed via MSTP are generated, including the timestamp, metadata and so on.
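To make the identity expectation concrete, here is a sketch that renders the same camera track directly and through an MSTP→VTG passthrough and logs the RVFC captureTime from both paths; the expectation that the two logs agree follows from rules 1-4 above and is not currently required by any spec:

```js
// Sketch: render a camera track directly and through an
// MSTP -> VTG passthrough, then compare the captureTime reported by
// requestVideoFrameCallback on both paths.
const [track] = (await navigator.mediaDevices.getUserMedia({ video: true }))
  .getVideoTracks();

const processor = new MediaStreamTrackProcessor({ track });
const generator = new VideoTrackGenerator();
processor.readable.pipeTo(generator.writable);

const direct = document.createElement('video');
direct.srcObject = new MediaStream([track]);

const looped = document.createElement('video');
looped.srcObject = new MediaStream([generator.track]);

for (const video of [direct, looped]) {
  document.body.append(video);
  await video.play();
  const label = video === direct ? 'direct' : 'looped';
  const onFrame = (now, metadata) => {
    console.log(label, metadata.captureTime);
    video.requestVideoFrameCallback(onFrame);
  };
  video.requestVideoFrameCallback(onFrame);
}
```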
Looking at the sink side in Safari, VideoFrame.timestamp is not used much:
- HTMLVideoElement will render the frame ASAP.
- RTCRtpSender and MediaRecorder roughly use the time at which the frame is submitted as the actual timestamp. I wonder what Chrome and Firefox are doing for those sinks. @alvestrand, @jan-ivar, do you know?
Chrome's internal VideoFrames have their timestamp directly sourced from VideoFrame.timestamp, but they also carry an optional absolute capture time which is set by some sources.
Both RtpSender and MediaRecorder prefer to use the capture time if present; otherwise:
- RtpSender will use an estimation of the capture time based on the frame submit time and timestamp (code ref).
- MediaRecorder will use roughly the submit time and ignore timestamp (code ref).
I think HTMLVideoElement uses the internal timestamp to smooth out playback; @drkron, can you shed some light?
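For the MediaRecorder side of this, one way to probe a given browser is to make VideoFrame.timestamp deliberately disagree with the wall-clock submission times and see which timeline the recording reflects. The numbers below (a claimed ~10 s presentation timeline submitted over roughly 1 s) are an arbitrary test setup, not behaviour mandated anywhere:

```js
// Sketch: write ~1 second of frames (by wall clock) whose timestamps
// claim to span ~10 seconds, record the generated track, and check the
// duration of the result. A recorder that honours VideoFrame.timestamp
// would produce a ~10 s recording; one that uses submission time, ~1 s.
const generator = new VideoTrackGenerator();
const writer = generator.writable.getWriter();

const recorder = new MediaRecorder(new MediaStream([generator.track]));
const chunks = [];
recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.start();

const canvas = new OffscreenCanvas(320, 240);
const ctx = canvas.getContext('2d');
for (let i = 0; i < 10; i++) {
  ctx.fillStyle = i % 2 ? 'black' : 'white';
  ctx.fillRect(0, 0, 320, 240);
  // Claim 1 s between frames (timestamps are in microseconds)...
  await writer.write(new VideoFrame(canvas, { timestamp: i * 1_000_000 }));
  // ...but actually submit them 100 ms apart.
  await new Promise((r) => setTimeout(r, 100));
}

recorder.stop();
await new Promise((r) => (recorder.onstop = r));
const blob = new Blob(chunks, { type: recorder.mimeType });
console.log('recorded bytes:', blob.size); // inspect duration in a player
```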