http-body
Question: can frame.into_data() be incomplete?
I use http-body to parse the body of an endless Transfer-Encoding: Chunked stream.
let frame = response.frame().await.expect("Stream ended").expect("Failed to read frame");
let Ok(data) = frame.into_data() else {
// frame is trailers, ignored
continue;
};
let decoded = serde_json::from_slice(&data)?;
// ...
But as I've discovered, under certain conditions data is incomplete. When complete, it ends in \n.
To fix it, I keep a buffer, parse only the part [0..(index of first b'\n')], and then remove that part from the buffer.
This leaves me with the following questions:
- Is this expected behavior from Frame? Having a partial piece in there?
- Is the \n a left-over from the Chunked separator \r\n?
Are you intending to buffer the whole response body? If so, then yes it might contain more than one frame. You can get the whole response using BodyExt::collect:
body.collect().await?.to_bytes()
It's not that it's incomplete, but this is a common misconception: writes from a peer do not map one-to-one onto reads locally. There are multiple things that can cut a write up into smaller pieces: TCP segment size, HTTP/2 DATA frame size, TLS record size, proxies/intermediaries.
You essentially want something like read_until(). This requires buffering data, since each "frame" may not contain all the bytes you want.
Enough people have asked about this that it makes me think we could probably come up with a helper in http-body-util.
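For illustration, the buffering described above could be sketched roughly like this. It is only a sketch using the standard library: `LineBuffer` and `push` are hypothetical names, not anything in http-body or http-body-util, and it assumes each record is terminated by a single `\n`. In the real loop you would pass the bytes from `frame.into_data()` to `push` and run `serde_json::from_slice` on each returned record.

```rust
/// Accumulates arbitrary byte chunks and yields complete `\n`-terminated records.
/// `LineBuffer` is a hypothetical helper, not part of http-body or http-body-util.
#[derive(Default)]
struct LineBuffer {
    pending: Vec<u8>,
}

impl LineBuffer {
    /// Appends one chunk (for example the bytes of one data frame) and returns
    /// every record completed so far, without the trailing `\n`.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        let mut records = Vec::new();
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            // Take everything up to and including the newline out of the buffer.
            let mut record: Vec<u8> = self.pending.drain(..=pos).collect();
            record.pop(); // drop the b'\n' itself
            records.push(record);
        }
        // Any bytes after the last newline stay in `pending` for the next call.
        records
    }
}

fn main() {
    let mut buf = LineBuffer::default();
    // Two JSON records arriving split across three reads.
    let chunks: [&[u8]; 3] = [b"{\"a\":", b"1}\n{\"b\"", b":2}\n"];
    for chunk in chunks {
        for record in buf.push(chunk) {
            println!("{}", String::from_utf8_lossy(&record));
        }
    }
}
```

The point of the design is that a record is only handed out once its terminator has been seen, so it does not matter where the transport happened to split the bytes. For an endless stream you would probably also want to cap the size of `pending` in case a terminator never arrives.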
> Are you intending to buffer the whole response body? If so, then yes it might contain more than one frame. You can get the whole response using BodyExt::collect: body.collect().await?.to_bytes()
No, the body is endless.
> It's not that it's incomplete, but this is a common misconception: writes from a peer do not equal the exact same reads locally. There are multiple things that can make a write get cut up into smaller pieces: TCP segment size, HTTP/2 DATA frame size, TLS record size, proxies/intermediaries.
> You essentially want something like read_until(). This requires buffering data, since each "frame" may not contain all the bytes you want.
> Enough people have asked about this that it makes me think we could probably come up with a helper in http-body-util.
Okay, so Frame is at a lower level than the CRLF-separated Chunk.
Looking at the spec a little bit more: https://en.wikipedia.org/wiki/Chunked_transfer_encoding#Encoded_data it seems that my \n detection is probably not correct and I need to do something a little bit smarter, taking the chunk size into account.
@seanmonstar reading more, I think I found where I got confused.
HTTP/2 doesn't have Chunked, but it does have Frames: https://httpwg.org/specs/rfc7540.html#FrameTypes
So the name Frame in HTTP/1 short-circuited my brain.