Aggregating stream_frame
I am currently implementing qlog for video streaming using quiche.
Due to the nature of video streaming, there is a lot of back and forth of STREAM and ACK frames, and they are not very useful to log.
From the server vantage point, ACKs can be aggregated with either the packets_acked or the frames_processed event. However, I did not find a way to aggregate the STREAM frames the server sends out. Is there a good way to solve this problem?
Hello @incfex,
Thanks for opening an issue.
IIUC, you're basically asking for a frames_created event, in which you would aggregate ACK and STREAM frames that are sent over multiple packets into a single event. Is that correct? This is indeed something we currently don't have, but that I feel would be logical to add.
For the ACK-frame special case, maybe we can add an endpoint discriminator to packets_acked so it can be re-used when either sending or receiving (though this kind of depends on how uniformly we end up doing qlog events down the line: it might make more sense to add that to frames_processed too and rename it to frames_info or something).
Example (though I'm not very happy with the semantics of "owner" here):
packets_acked {
  owner?:"local"|"remote",
  packet_number_space?:PacketNumberSpace,
  packet_numbers?:Array<uint64>
}
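For instance, a hypothetical server-side event using that extended definition could look like this (the recovery category and the application_data value are assumptions based on the current draft; the packet numbers are made up):
{
  "time": "250",
  "name": "recovery:packets_acked",
  "data": {
    "owner": "remote",
    "packet_number_space": "application_data",
    "packet_numbers": [3, 4, 5]
  }
}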
I am however a bit confused by your motivation saying:
there are a lot of back and forth of STREAM and ACK frames, and they are not very useful to log.
Aren't you still logging them all in frames_processed (or the potential frames_created)? Or how does your approach make that specifically better than using packet_sent and packet_received? Do you mean you don't want to log the packet info and just the frame details? Or are you logging select ACKs/STREAM frames (but how do you decide which to log then)? Or is there something else going on?
Sorry for the confusion. My current situation is: on the SERVER side, it is sending a lot of STREAM frames and receiving a similar amount of ACK frames.
IIUC, you're basically asking for a frames_created event, in which you would aggregate ACK and STREAM frames that are sent over multiple packets into a single event.
This is basically what I am asking for, but only for ACK and STREAM frames, and with a little twist.
qlog currently requires a STREAM frame to have the frame_type and stream_id fields. In my case, when I am logging every packet_sent, each packet_sent only has a single STREAM frame in it, and there are usually 50 consecutive packet_sent events like this. Thus, I am proposing a frames_created event with a default_frame field inside. When frames in the array do not specify their frame_type, we could assume that they have the same field values as the default.
Proposed frames_created example:
frames_created {
  default_frame?:QuicFrame,
  frames?:Array<QuicFrame>,
  packet_numbers?:Array<uint64>
}
Current example of logging all the STREAM frames with packet_sent (JSON format):
{
  "events": [
    {
      "time": "1",
      "name": "transport:packet_sent",
      "frames": [
        {
          "fin": false,
          "frame_type": "stream",
          "length": 1000,
          "offset": 2000,
          "stream_id": 0
        }
      ],
      "header": {
        "packet_number": 3,
        "packet_size": 1024
      },
      "packet_type": "1RTT",
      "transmission_type": "not_retransmission"
    },
    {
      "time": "100",
      "name": "transport:packet_sent",
      "frames": [
        {
          "fin": false,
          "frame_type": "stream",
          "length": 1000,
          "offset": 3000,
          "stream_id": 0
        }
      ],
      "header": {
        "packet_number": 4,
        "packet_size": 1024
      },
      "packet_type": "1RTT",
      "transmission_type": "not_retransmission"
    },
    {
      "time": "200",
      "name": "transport:packet_sent",
      "frames": [
        {
          "fin": false,
          "frame_type": "stream",
          "length": 500,
          "offset": 4000,
          "stream_id": 0
        }
      ],
      "header": {
        "packet_number": 5,
        "packet_size": 524
      },
      "packet_type": "1RTT",
      "transmission_type": "not_retransmission"
    }
  ]
},
...
Example of logging with the frames_created event:
{
  "events": [
    {
      "time": "1",
      "name": "transport:frames_created",
      "data": {
        "default_frame": {
          "frame_type": "stream",
          "stream_id": 0,
          "length": 1000
        },
        "frames": [
          {"offset": 2000},
          {"offset": 3000},
          {"offset": 4000, "length": 500}
        ],
        "packet_numbers": [3, 4, 5]
      }
    }
  ]
}
This approach could also apply to ACK frames, becoming a dumb version of the packets_acked event that does not care about repeated ACKs.
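To make the ACK case concrete, here is a sketch of how a run of received ACK frames could be aggregated the same way, reusing the existing frames_processed event together with the proposed default_frame and packet_numbers fields (the acked_ranges values and packet numbers are just for illustration):
{
  "time": "300",
  "name": "transport:frames_processed",
  "data": {
    "default_frame": {
      "frame_type": "ack"
    },
    "frames": [
      {"acked_ranges": [[0, 3]]},
      {"acked_ranges": [[0, 4]]},
      {"acked_ranges": [[0, 5]]}
    ],
    "packet_numbers": [10, 11, 12]
  }
}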
What do you think of this?
Thanks for the additional information, that's much clearer now.
While I agree with the use case for a frames_created event, I'm quite skeptical about the proposed design, primarily the default_frame setup. I'm having a hard time imagining many implementers wanting to use it like that in a general-purpose setup.
For example, you'd have to check if per-frame values deviate from the default, and you'd have to keep multiple of these events "alive"/buffered, one for each frame type you might care about (and demux frame data to the correct one). It would also require quite some more advanced logic in tools. I feel that would work in your specialized use case (only STREAM/ACK, many similar frame instances), but IMO that's not what should be standardized: this would be better suited for a "custom", application-specific qlog event, which is perfectly allowed by the spec as well.
This is partly because, IIUC, one of the main reasons to do this is to reduce log verbosity. qlog has a long history of trying different ways to combat this (e.g., a convoluted csv-alike setup in draft-01 with columns). For draft-02, we took the decision to care less about (textual) verbosity/repetition, since this can be dealt with using either compression or a binary serialization format (see #30 for some earlier discussion and #144 for the current issue on this). I think moving from pure packet_sent to a general frames_created (where you still log each frame in full, without default_frame) already helps a lot in this regard. Going further should IMO be application-specific.
So what I'd propose is to add something like this:
frames_created {
  frames:Array<QuicFrame>,
  packet_numbers?:Array<uint64>
}
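To make this concrete, the three packets from your earlier packet_sent example could then be logged together as something like this (just a sketch; the transport category is assumed):
{
  "time": "1",
  "name": "transport:frames_created",
  "data": {
    "frames": [
      {"frame_type": "stream", "stream_id": 0, "offset": 2000, "length": 1000, "fin": false},
      {"frame_type": "stream", "stream_id": 0, "offset": 3000, "length": 1000, "fin": false},
      {"frame_type": "stream", "stream_id": 0, "offset": 4000, "length": 500, "fin": false}
    ],
    "packet_numbers": [3, 4, 5]
  }
}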
I think probably @marten-seemann also has some opinions on this.
Thanks for your response!
I agree that this would require more advanced logic in tools to support default_frame. My original intent is to only apply this field to STREAM/ACK frames, since they are the most likely ones to appear consecutively, and to only maintain/buffer one frames_created event for both frame types. As soon as the next frame differs from the maintained default_frame, the current frames_created is concluded and a new one is created. This way, we do not have to buffer multiple of these events.
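For example, if a run of STREAM frames is followed by a frame that does not match the default (say, a MAX_DATA frame; all values here are made up), the buffered event is flushed and the non-matching frame is logged as usual:
{
  "time": "1",
  "name": "transport:frames_created",
  "data": {
    "default_frame": {"frame_type": "stream", "stream_id": 0, "length": 1000},
    "frames": [
      {"offset": 2000},
      {"offset": 3000}
    ],
    "packet_numbers": [3, 4]
  }
},
{
  "time": "150",
  "name": "transport:packet_sent",
  "frames": [
    {"frame_type": "max_data", "maximum": 100000}
  ],
  "header": {
    "packet_number": 5,
    "packet_size": 40
  },
  "packet_type": "1RTT",
  "transmission_type": "not_retransmission"
}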
I'm having a hard time imagining many implementers wanting to use it like that in a general-purpose setup.
About this: if a website has video playback or file download, it will have bulks of consecutive STREAM/ACK frames. Having a video on a website is not that uncommon these days.
Maybe changing default_frame to common_frame_type, and only declaring frame_type:string inside it, would reduce the amount of advanced logic while still saving space to store the qlog.
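Something like this (just a sketch of the idea):
frames_created {
  common_frame_type?:string,
  frames?:Array<QuicFrame>,
  packet_numbers?:Array<uint64>
}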
After applying frames_created and frames_processed to only the consecutive STREAM and ACK frames, I have observed a 70~80% space saving in the final qlog. These event types prove to be very useful in situations where large objects are transferred (e.g. streaming, downloading).
Hello @incfex,
Thanks for keeping us updated on this.
IIUC, you have now created a custom frames_created event that makes use of your proposed default_frame field, right? And that approach saves you 70 to 80% of qlog size. Would it be possible to upload an example of such a resulting file here so I can better interpret what exactly you're doing?
During the recent discussion on this at the IETF, I proposed not adding events like frames_created to qlog, to keep the qlog spec more straightforward. In order to reduce qlog size, I argued that 1) it doesn't matter -that- much in practice (some large deployments like Facebook use verbose JSON directly) and 2) it can largely be solved using (on-the-fly/streaming) compression (see also #30).
As such, it would help me to get some insight into your specific use case and the reasons why keeping qlog size down (if that is indeed the main motivation) is crucial to your setup, and why that's difficult to achieve by using compression instead. I'm not arguing that additional size optimizations would be useless, but I am debating whether the complexity of adding these events to the qlog standard is worth the benefits.
Relatedly, I wonder how you are using the logged information concretely. No longer having specific timestamps and packet associations for individual ACKs/STREAM frames reduces debuggability in some ways. How are you currently processing the qlogs in practice to e.g., find bugs/inefficiencies or how do you plan to do so?
Thank you in advance for this additional insight!
IIUC, you have now created a custom frames_created event that makes use of your proposed default_frame field, right? And that approach saves you 70 to 80% of qlog size. Would it be possible to upload an example of such a resulting file here so I can better interpret what exactly you're doing?
Absolutely, here is the qlog format we are currently using: f576620f4936d820.qlog.zip. This is a fork of Facebook's mvfst implementation, and still a work in progress. Since we are only logging on the server side, it only aggregates outgoing STREAM and incoming ACK frames. You might notice repeated "packetNums" for ACKs; that is because of the PADDING frame attached to the ACK frames.
During the recent discussion on this at the IETF, I proposed not adding events like frames_created to qlog to keep the qlog spec more straightforward. In order to reduce qlog size, I argued that 1) it doesn't matter -that- much in practice (some large deployments like Facebook use verbose JSON directly) and 2) it can largely be solved using (on-the-fly/streaming) compression (see also #30).
I have watched that session and learnt a lot from it. I now agree that, if we are trying to make qlog a logging format, sticking to the wire format is the way to go. However, writing qlog is hammering our disk IO, increasing IO latency and thus lowering our bandwidth. Also, our QUIC server works in edge computing, where the disk space to store and the bandwidth to transfer qlog cost a lot, so we cannot use verbose JSON directly. (On-the-fly/streaming) compression does help, and we are using it now to further reduce the size of our qlogs.
As such, it would help me to get some insight into your specific use case and the reasons why keeping qlog size down (if that is indeed the main motivation) is crucial to your setup and why that's difficult to achieve by using compression instead. I'm not arguing having additional size optimizations would be useless, but I am debating if the complexity of adding these events to the qlog standard are worth their benefits.
Reasons as explained before:
- high cost to store and transfer the qlog.
- writing un-aggregated qlog impacts our disk IO, causing lower network bandwidth.
- compression is used on top of the aggregated qlog.
Our current observation is that implementing qlog on a machine that is currently running QUIC just fine can cause a performance impact, and to mitigate this you have to increase the budget.
Relatedly, I wonder how you are using the logged information concretely. No longer having specific timestamps and packet associations for individual ACKs/STREAM frames reduces debuggability in some ways. How are you currently processing the qlogs in practice to e.g., find bugs/inefficiencies or how do you plan to do so?
We are just starting to use qlog and trying to see what we could use it for. Specific timestamps can be added using a time delta, just like for other events. Currently we have a parser that parses our aggregated qlog back into standard qlog.
There might be some places I did not explain clearly; if you have any questions, just shoot. This also helps us a lot.
This custom optimization seems completely fine to do if people want to add the complexity. However, for the reasons outlined, I don't see a reason to try and design this into the base qlog specification.
Perhaps the original poster has had time to learn from deployment since the issue was opened. But absent any followup I think this can be safely punted into a qlog extension.
Closing as timed out.