Safe API to reincorporate initialized/filled bytes from ReadBuf.take() sub-buffer
Is your feature request related to a problem? Please describe.
It would be nice if there was a better way to advance the "parent" buffer after using .take(). Consider the current implementation of AsyncRead for Take: https://docs.rs/tokio/1.46.1/src/tokio/io/util/take.rs.html#79
let n = b.filled().len();
// We need to update the original ReadBuf
unsafe {
buf.assume_init(n);
}
buf.advance(n);
Describe the solution you'd like
One idea would be some sort of API to close the "take" operation by "putting" the sub-buffer back:
buf.put(b); // or maybe put_take() or something?
That could consume b (which would give the API a really nice "bookends" feel), but it wouldn't need to, necessarily: a reference would be fine. The main thing it would do is to ensure that the b buffer is actually a result of calling buf.take() (ie: the referred memory regions are a sub-slice) and use the initialized/filled regions of b to update buf.
Describe alternatives you've considered
I really dislike unsafe code, so right now I'm just calling buf.initialize_unfilled() at the start and then later I can buf.advance(n), which is safe. This is suboptimal because of the unnecessary initialization.
Another possibility would be to update the .put_slice() API to notice if the slice being "put" is a subset of the existing buffer and behave accordingly. It feels a bit magic, though, so I think a dedicated API would be better...
Sorry, I don't fully understand your feature request. Would you mind writing some examples for your proposed feature including your workaround without this feature?. This could help me to understand your problem.
You may be thinking of what led to #4428. If so, then that's considered an API mistake but we can't fix it without a breaking change.
You may be thinking of what led to #4428. If so, then that's considered an API mistake but we can't fix it without a breaking change.
Right. I noticed that assert in there and had an inkling that it might be required to prevent evil reader implementations from producing UB via the unsafe block below (since the use of unsafe there could otherwise turn a logic error into a memory error). It's part of why I want to make this change, though: removing that unsafe block would also allow us to remove that assert because it should mean that UB is not possible in that case.
Here's an example of what I have right now:
fn poll_read(
self: Pin<&mut Self>,
cx: &mut task::Context<'_>,
buf: &mut ReadBuf<'_>,
) -> Poll<io::Result<()>> {
let this = self.project();
if *this.remaining == 0 {
return Poll::Ready(Ok(()));
}
// https://github.com/tokio-rs/tokio/issues/7446
buf.initialize_unfilled();
let mut slice = buf.take(usize::try_from(*this.remaining).unwrap_or(usize::MAX));
ready!(this.inner.poll_read(cx, &mut slice))?;
let n = slice.filled().len();
buf.advance(n);
if n == 0 {
Err(ErrorKind::UnexpectedEof)?;
}
// Update counters
*this.remaining -= n as u64;
this.progress.inc(n as u64);
Poll::Ready(Ok(()))
}
I'm just using .advance() because after the buffer is initialized then it's safe. I'd like to get rid of the buf.initialize_unfilled() though, but this creates problems: I can't .advance() the buffer unless the space I'm advancing into is initialized, and starting out, the buffer is uninitialized.
I would like my code to look like:
fn poll_read(
self: Pin<&mut Self>,
cx: &mut task::Context<'_>,
buf: &mut ReadBuf<'_>,
) -> Poll<io::Result<()>> {
let this = self.project();
if *this.remaining == 0 {
return Poll::Ready(Ok(()));
}
let mut slice = buf.take(usize::try_from(*this.remaining).unwrap_or(usize::MAX));
ready!(this.inner.poll_read(cx, &mut slice))?;
buf.put(slice); // merge the data back into the parent buffer
let n = slice.filled().len();
if n == 0 {
Err(ErrorKind::UnexpectedEof)?;
}
// Update counters
*this.remaining -= n as u64;
this.progress.inc(n as u64);
Poll::Ready(Ok(()))
}
There would be no change in the content of buf from the put() operation: only the "initialized" and "filled" pointers would be advanced according to the data in slice.
That the dance we're doing right now is necessary at all is considered a mistake, but it's of course possible that the dance could be improved. You can try and submit a PR that improves this logic and I can consider it.
I'll give it a try, but no promises. As I mentioned, I don't like unsafe, so I'm kinda bad at pointers and stuff 😅
So there's an issue that I hit pretty much right away: the suggested API won't work because take() returns a mutable ref on the original struct, which means you can't call a method on the original struct while the "taken" buffer is in scope, much less passing the buffer itself to that method....
Not sure how to work around that to be honest, without introducing a more invasive parent/child relationship...