Section 3.2.1: Add N48 - bandwidth feedback speed
Issue #114 did not include text describing what aspect requires clarification. Since there is already an Issue open relating to Section 3.2 (#103), perhaps we can use that as the reference point.
The proposed requirement N49 has previously been flagged as a problem for the Game Streaming use case, because players will be surprised and concerned when asked for camera and microphone permission in a game that does not make obvious use of those devices. So perhaps we should break N49 into its own PR.
The problem with support for rendering of "partial images" (e.g. slices or tiles) is that modern RTP payload specifications such as AV1 and VVC do not support SLI (Slice Loss Indicator), nor is SLI supported in any WebRTC browser. So there is no way to request recovery of a slice/tile, only the ability to negotiate support for NACK and RTX. This negotiation does not cover specific strategies like differential reliability (e.g. NACK only the base layer), so there are no SDP semantics to work with. Is there specific functionality you have in mind?
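For concreteness, here is a small sketch (my own illustration, assuming an established RTCPeerConnection named `pc`) of the only loss-recovery signalling an application can inspect today: the negotiated rtcp-fb/RTX lines. Nothing in the SDP expresses slice/tile recovery or differential reliability.

```ts
// Illustration only: list the negotiated feedback/RTX lines from the remote description.
// NACK/RTX is all that can be negotiated; there is no attribute for SLI or for
// differential reliability such as "NACK base layer only".
function negotiatedVideoFeedback(pc: RTCPeerConnection): string[] {
  const sdp = pc.remoteDescription?.sdp ?? "";
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith("a=rtcp-fb:") || line.includes(" rtx/"));
}
// Typical result: ["a=rtcp-fb:96 transport-cc", "a=rtcp-fb:96 nack",
//                  "a=rtcp-fb:96 nack pli", "a=rtpmap:97 rtx/90000", ...]
```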
While support for higher resolutions (e.g. 4K) and framerates is critical to game streaming, #103 points out that requirement N37 is not specific about what API changes are required. Also, N37 mentions copy removal, which is more in scope for other WGs such as the MEDIA or WebGPU WGs than the WEBRTC WG. So currently PR https://github.com/w3c/webrtc-nv-use-cases/pull/117 is reformulating N37 to remove mention of copy removal and to focus specifically on concerns raised by developers about lack of control over hardware acceleration in WebRTC compared with lower-level APIs like WebCodecs.
For example, a 4K high-framerate game would likely require hardware decode to function well. Since WebRTC is a high-level API, it does not provide applications with information about hardware acceleration issues (e.g. inability to allocate hardware resources, or an error that causes failover to software). So a 4K high-framerate game could fail over to software decode, making players very unhappy, and there is no event or error that would surface this information immediately. Does PR #117 address your performance concerns?
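To illustrate the gap, a rough sketch (my own, not from the PR) of what an application can do today: poll getStats() and watch the decoder implementation fields. This assumes the browser exposes decoderImplementation and powerEfficientDecoder on inbound-rtp stats (both are defined in webrtc-stats, but exposure varies); there is still no event fired at the moment a hardware-to-software failover occurs.

```ts
// Polling-based workaround sketch: check whether video decode is (still) hardware-backed.
async function logDecoderInfo(pc: RTCPeerConnection): Promise<void> {
  const report = await pc.getStats();
  report.forEach((stats: any) => {
    if (stats.type === "inbound-rtp" && stats.kind === "video") {
      // Fields are optional and browser-dependent; treat absence as "unknown".
      console.log("decoder:", stats.decoderImplementation ?? "unknown",
                  "powerEfficient:", stats.powerEfficientDecoder ?? "unknown");
    }
  });
}
// e.g. setInterval(() => logDecoderInfo(pc), 5000); there is no immediate failover event.
```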
@aboba Thank you for the prompt review of the PR. I agree with merging this with #103, and I will split N49 into its own PR.
I believe you are referring to N48 here. I will prepare more clarification after internal discussion.
Thank you again for the feedback; I agree overall, but I also need to discuss this item internally.
I'm in favour of this, although it describes the mechanism rather than the outcome. One extra possible knob on the jitter buffer issue: there are situations where it is undesirable for the playout to speed up to catch up (an example is remote control of a vehicle, where the perceived acceleration after a video freeze is very disconcerting). It would be useful to be able to set a flag that minimised this effect whilst still allowing display of the most up-to-date frame.
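The closest existing knob I am aware of is RTCRtpReceiver.jitterBufferTarget from WebRTC Extensions; a sketch below, assuming the browser implements it. A fixed target biases the buffer toward consistent latency, but it is not the "don't speed up after a freeze" flag proposed here.

```ts
// Sketch: ask video receivers to hold a steady jitter-buffer target rather than the minimum.
function preferConsistentLatency(pc: RTCPeerConnection, targetMs: number): void {
  for (const receiver of pc.getReceivers()) {
    if (receiver.track.kind === "video") {
      // jitterBufferTarget is not in every TypeScript DOM typing yet, hence the cast.
      (receiver as any).jitterBufferTarget = targetMs;
    }
  }
}
// Usage: preferConsistentLatency(pc, 150); // e.g. a steady 150 ms target
```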
This issue was discussed in WebRTC July 2023 meeting – (PR #118: Clarify Game Streaming requirements (Section 3.2))
This issue was mentioned in WebRTC TPAC 2023 meeting – 12 September 2023 (Low Latency Streaming: Game Streaming use case)
@aboba I think the current issue has too many items to review at once, so I am thinking about splitting it into 2 or 3 separate issues. What do you think about this idea?
Some of these are not W3C requirements but IETF ones.
RPSI is already covered by https://datatracker.ietf.org/doc/html/rfc8834#section-5.1.4
Controlling RTCP feedback frequency is similar to trr-int from RFC 4585. Specifically for TWCC you would need to extend the syntax, similar to how RFC 8888 does it. I do think that sprang's suggestion to just use the newer (even less standard) v2 format, which lets the other side control the interval through the header extension, is better.
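For reference, a purely illustrative SDP fragment showing where trr-int sits today (RFC 4585 syntax). Browsers negotiate transport-cc but do not act on trr-int for TWCC, so the last line is aspirational rather than something that works now.

```ts
// Illustrative only: RFC 4585-style feedback attributes for one video payload type.
const illustrativeVideoSection = [
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
  "a=rtcp-fb:96 transport-cc", // TWCC feedback negotiated, but its interval is not controllable
  "a=rtcp-fb:96 nack",
  "a=rtcp-fb:96 trr-int 100",  // RFC 4585 regular-feedback cap; not honoured for TWCC today
].join("\r\n");
```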
There are some existing discussions relating to LTR/RPSI:
- IETF discussion of RPSI: https://github.com/aboba/hevc-webrtc/pull/17
- RPSI CL review: https://webrtc-review.googlesource.com/c/src/+/104880 (this review describes some of the problems in implementing RFC 7798-style RPSI; it is not clear how an SFU can determine whether a P-frame based on the LTR would be decodable by participants. The issue is not just whether the LTR was sent, but whether it was received, decoded and is currently stored in the buffer.)
- WebCodecs issues: https://github.com/w3c/webcodecs/issues/743 and https://github.com/w3c/webcodecs/issues/285
> N51 is similar to trr-int from RFC 4585. Specifically for TWCC you would need to extend the syntax similar to how RFC 8888 does it. I do think that sprang's suggestion to just use the newer (even less standard) v2 format which lets the other side control the interval through the header extension is better.
@fippo Thank you for the feedback. I assume you are referring to N50 rather than N51.
We understand that Transport-wide Congestion Control 02 provides a way to configure transport-wide CC feedback. However, we want an API to set the interval for transport-wide CC feedback, to resolve the following concerns from game streaming (a hypothetical API sketch follows the list):
- This is not reliable, as the sender's request might get lost in the network, resulting in no feedback.
- Sending extra headers reduces the available payload size.
- Jitter in the sender's requests shows up as jitter in feedback reception.
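To make the ask concrete, a hypothetical API sketch follows. setCongestionControlFeedbackInterval does not exist in any spec or browser; it only illustrates the shape of the requirement, namely that the receiver, rather than a per-packet sender request, decides the feedback cadence.

```ts
// Hypothetical extension: everything named here is invented for illustration.
interface FeedbackIntervalTransport extends RTCDtlsTransport {
  setCongestionControlFeedbackInterval?(intervalMs: number): void; // hypothetical
}

function requestFastTwccFeedback(pc: RTCPeerConnection, intervalMs: number): void {
  for (const receiver of pc.getReceivers()) {
    const transport = receiver.transport as FeedbackIntervalTransport | null;
    // Receiver-side configuration avoids the lost-request and extra-header concerns above.
    transport?.setCongestionControlFeedbackInterval?.(intervalMs);
  }
}
// e.g. requestFastTwccFeedback(pc, 25); // ask for feedback roughly every 25 ms
```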
> N49 is already covered by https://datatracker.ietf.org/doc/html/rfc8834#section-5.1.4
@aboba, @fippo We appreciate both sets of feedback and understand the current discussion. As a short-term solution we want to pursue the LNTF RTCP message, and we will continue to pursue the RPSI approach and look for a way to address the codec-agnostic concern raised in "RPSI RTCP feedback support" (Issue #13).
However, beyond the transport protocol topic, we believe RTP depacketization and framing would need to be updated to recover using non-key frames, for the following reasons (a conceptual sketch follows the list):
- Currently the RTP receiver stops providing frames to the decoder on packet loss.
- We need a way to resume providing subsequent, completely received non-key frames to the decoder.
- This requires decoder API support (only the encoder API was discussed at TPAC).
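A conceptual sketch of the desired receive-side behaviour, using WebCodecs names purely for illustration (the built-in WebRTC receiver does not expose this point, and `referencesAvailable` stands in for whatever reference tracking the depacketizer would need): after a loss, keep feeding the decoder any completely received frame whose references are still decodable, instead of stalling until the next key frame.

```ts
// Illustration only: decide whether an assembled frame can go to the decoder after loss.
function onAssembledFrame(decoder: VideoDecoder,
                          chunk: EncodedVideoChunk,
                          referencesAvailable: boolean): void {
  if (chunk.type === "key" || referencesAvailable) {
    // Decodable: a key frame, or a non-key frame whose references (e.g. an LTR) are held.
    decoder.decode(chunk);
  } else {
    // Today the pipeline stalls here until a key frame arrives; the requirement is to
    // resume with the next decodable non-key frame instead.
  }
}
```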
@steely-glint Thank you for the feedback. I have added outcome statements to each requirement.
@aboba, @fippo I have removed N49, since the RPSI discussion was already answered in the last working group discussion. Also, I have added a new requirement (51) for discussing support for L4S when it is supported by the platform and network. I look forward to your feedback on whether it is appropriate for the game streaming requirements.
L4S support would involve modification to the transport-cc algorithm, but AFAICT there would be no change to the WebRTC API.
@aboba Thank you for the feedback. I will remove the requirement shortly.
Since N51 (Jitter Buffer Calculation Accuracy Improvement) also does not require API changes, should we drop it as well?
Note that L4S, COPA, or a similar real-time transport congestion control algorithm would be a requirement for game streaming based on WebTransport. There are some QUIC implementations (e.g. Apple's and Meta's) that support those algorithms now.
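For context, a minimal sketch of that WebTransport path (the URL is a placeholder and fragmentation/FEC are omitted): the application sends encoded frames as datagrams, and the congestion controller, whether L4S-aware, COPA-style, or otherwise, lives in the QUIC stacks at either end rather than in this API.

```ts
// Sketch: push encoded video frames over WebTransport datagrams.
async function sendEncodedFrames(frames: AsyncIterable<Uint8Array>): Promise<void> {
  const wt = new WebTransport("https://game-streaming.example/session"); // placeholder URL
  await wt.ready;
  const writer = wt.datagrams.writable.getWriter();
  for await (const frame of frames) {
    // Real code would split frames to fit the datagram size limit; omitted for brevity.
    await writer.write(frame);
  }
  writer.releaseLock();
  wt.close();
}
```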
Assuming we would like to support L4S through the current WebRTC protocol and to find a solution using transport-wide CC, would it still be worth discussing at the working group meeting?
Edited my comment above with the actual description, thanks!
> However, we want an API to set the interval for transport-wide CC feedback to resolve the following concerns from game streaming.
This is similar to the rtx-time parameter in the SDP. Since it is SDP, you'll need to convince the IETF (avtcore most likely; I think mmusic shut down?).
WebRTC's split between the IETF and the W3C has always been somewhat weird, but SDP and RTCP semantics are clearly in IETF land.
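For reference, the SDP parameter being compared to here is RFC 4588's rtx-time, which bounds how long the sender keeps packets available for retransmission; shown only as an illustration of a timing knob that already lives in SDP, and therefore in IETF territory.

```ts
// Illustrative SDP lines showing RFC 4588 RTX with the rtx-time parameter.
const rtxExample = [
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:97 rtx/90000",
  "a=fmtp:97 apt=96;rtx-time=3000", // keep retransmission history for up to 3000 ms
].join("\r\n");
```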
@fippo Thank you for the review. I have dropped requirement N50 (Transport-Wide CC Timing Req) from the PR.
I think N50 and N51 made sense. Why could we not have a knob saying something like "short latency is very important, but consistent latency is even more so, so please consider this (i.e., do not speed up too much to catch up) in the video processing chain"?
And why can't we have a requirement on enabling quicker reacting (transport wide) congestion control (presumably enabled by frequent RTCP reports)? Exactly how it is solved (if solvable) can be discussed later, we are working with the use cases and requirements in this document.
@stefhak Thank you for reviewing and sharing the feedback. I have restored N50 and N51. I hope to get more feedback from other working group members and at the working group meeting tomorrow.
@xingri thank you. I created #126 in an attempt to further clarify. I don't know if that makes sense or just confuses.
@stefhak Thank you for clarifying the statements. I hope this will help the Working Group members understand the intent of the PR.
@aboba, @fippo and @stefhak Unfortunately we did not have a chance to discuss this PR at today's meeting. I am willing to join a separate meeting apart from the December meeting, or I am OK with extending the discussion time.
Above all, we would like the Game Streaming requirements to reach consensus. If you think the current issue could block reaching consensus, please recommend next steps to us.
Just noting that I agree with @xingri: it would be good to discuss and agree on the gaming-related requirements. I could not join yesterday (and also suspected it would not be discussed, as it was the last agenda item :) ). Perhaps the December and/or January meetings are viable options?
@stefhak We are planning to have a session next Tuesday, as in the shared slides (WEBRTCWG-2023-12-05 - Google Slides).
@aboba I know we have assigned 20 minutes for this discussion, but I am wondering if we could extend it by about 10 minutes to make sure it gets discussed this time.
This issue was mentioned in WebRTC December 5 2023 meeting – 05 December 2023 (Game Streaming)
Updated the requirements based on the feedback from the First December WebRTC WG Virtual Interim (12-05-2023).
@henbos, @fippo and @aboba Please review N48 and let me know if you have further requests. @alvestrand Please review N49 and let me know whether it meets your request from the meeting.
This LGTM now. My only comment would be that N49 could perhaps be less specific: it could say that the (receiving) application must be able to control the feedback transmission interval (in order to adapt the video quality to the varying network and maintain consistent latency), without going into details about RTCP and TWCC. In other words, the requirement would be for an API to control the feedback transmission interval, period.
With that said, I'm fine with merging it "as is".
@stefhak Thank you for the feedback. I have updated the requirements.
@alvestrand would something along the lines of
> The application must be able to control how quickly a media-receiving user agent must inform the media sender about network throughput changes (degradations? losses?), in order to keep latency stable.

work?
Can we split the "decode after loss" and "control speed of bandwidth adaptation" requirements into separate PRs, so that we can discuss each separately? They seem to raise different issues.