ureq icon indicating copy to clipboard operation
ureq copied to clipboard

Return stream to pool when fully read but haven't reached EOF

Open jsha opened this issue 5 years ago • 5 comments

Followup from https://github.com/algesten/ureq/pull/160#issuecomment-699605597:

Hmm… I merged, but maybe we want to do something where it does return to pool if it is fully read?

I think the case you're talking about here is "user already knows externally that response is exactly N bytes, so they read N bytes exactly, and never read an EOF." That was actually the case in the git2-rs change I've been working on: the git smart HTTP transport has internal length information in the response bodies, so libgit was stopping its reads right before reading off the last chunk. Right now in that case the stream is not returned to the pool.

One way we could support this: restore the impl Drop, but instead of returning the connection to the pool, do a 1-byte read. That would trigger the PoolReturnRead's EOF detection, which would return the stream to the pool. The problem is, in the case where there's nothing to read, this would block. I'm pretty sure we don't want that. We could do a non-blocking peek, like in serverclosed_stream, but that would mean punching through a bunch of layers - PoolReturnRead and (ChunkDecoder | LimitedRead).

It may not be possible to solve this in the general case, because for instance a chunked stream might be "done" but still need to read off the last chunk (0\r\n\r\n), and I don't think we want to do real I/O in a drop. Perhaps for the Content-Length case, though, we can know that we're at the end by the number of bytes read. And in the chunked case, we could use read bytes only so far as they are already in the buffer.

Also, there's an easy fix on the user side, which is to read body all the way through to EOF.

I think this probably adds more complexity than it's worth and am inclined to wait for someone to ask for it, but could definitely be convinced otherwise.

jsha avatar Sep 27 '20 17:09 jsha

Punching through the layers sounds like a bit messy, and I agree we should not do any I/O in the drop, which means we could do this for two cases, for chunked when the user also reads 0\r\n\r\n, and for content-length when the user reads the entire length.

algesten avatar Sep 27 '20 18:09 algesten

we could do this for two cases, for chunked when the user also reads 0\r\n\r\n, and for content-length when the user reads the entire length.

For chunked, if the user reads as far as the 0, chunked_transfer reads the following \r\n\r\n and returns EOF: https://github.com/frewsxcv/rust-chunked-transfer/blob/1.2.0/src/decoder.rs#L128-L145. So I think it's not possible to do anything special for chunked. Agreed that we can do this for content-length, when LimitedRead's limit == position.

jsha avatar Sep 27 '20 19:09 jsha

For chunked, if the user reads as far as the 0, chunked_transfer reads the following \r\n\r\n and returns EOF: https://github.com/frewsxcv/rust-chunked-transfer/blob/1.2.0/src/decoder.rs#L128-L145.

Reading until getting the 0 shouldn't be a problem for the underlying reader/stream. Those should be reusable. Or is there something I'm missing?

algesten avatar Sep 28 '20 08:09 algesten

I think I may have misunderstood what you are saying. I agree chunked transfer streams are reusable if the user reads all the way to EOF (which means the ChunkDecoder reads the final 0\r\n\r\n and returns Ok(0)). That already works properly, and the stream gets returned to the pool by PoolReturnRead's read() method, not the drop impl we recently deleted.

However, if the user reads everything but the EOF (meaning the ChunkDecoder doesn't read the final \0\r\n\r\n), we can't return it to the pool without doing additional I/O to read that final chunk. The one possible avenue would be to become more buffering-aware, and read the final chunk only if it happened to be in the buffer.

jsha avatar Sep 28 '20 17:09 jsha

@jsha aaaaahhh.. I'm misunderstanding. We are already returning to pool correctly. The drop trait is only needed for content-length cases where the user read exact and never encountered the 0 read.

algesten avatar Sep 29 '20 08:09 algesten