Some minor grammar fixes, make squeue's `push` functions consume each `entry` instead of cloning it, and make `push_unchecked` public
It is strange to take a reference to an entry and then immediately cloning it, it should be at the developers' discretion whether or not they would like their entry cloned. For low-latency operations this unnecessary cloning can be expensive and is completely unnessecary as far as I can tell.
I don't know why push_unchecked was ever private; it should again be at the developer's discretion whether or not to use unchecked functions.
I initially thought that it had something to do with keeping the integrity of the Entry intact, that dropping the entry would somehow make it unviable, but the clone is obviously immediately dropped anyway...
I've also noticed that the Iterator syntax of the CQ is almost completely useless because there is no way to sync the CQ after it's been exhausted, since using it as an IntoIterator consumes it.
This is why I've made pop public...
For low-latency operations this unnecessary cloning can be expensive
Do you have any benchmarks proving this? we need to copy the entry from application's memory (either stack or heap) to ring, which is always an unavoidable memcpy. Using move does not change this.
If you want to optimize push entry, should look at this https://github.com/tokio-rs/io-uring/issues/116
I don't know why push_unchecked was ever private; it should again be at the developer's discretion whether or not to use unchecked functions.
Do you have any need to use it?
Also, please don't put too many unrelated changes in one PR.
I apologize for the PR overloading.
Perhaps performance wasn't the right way of proposing this change. It is trivially obvious that this cloning paradigm will only ever create more instructions for the processor to execute, and will only ever incur more copying than moving the value if the compiler cannot optimize away the clone (this optimization is what would be observed in the currently implemented tests).
Any test using Nop can trivially optimize away the clone because cloning a unit struct is free. The other tests include the reference inside an unsafe block, and so the compiler knows the original value (to which the reference refers) is immediately dropped, therefore the clone can be optimized to a move anyway.
That is not to say that the memcpy of a 64-bit struct is not incredibly cheap and will almost certainly not have any impact on prod code.
That is why perhaps the more important result of changing it to a consuming paradigm is removing developer confusion. I know personally I sat and scratched my head for several minutes wondering what the importance of this borrow was: "Do I have to ensure the contents of the Entry remain intact while the kernel can read it?". Of course, the answer is no, because the Entry is immediately cloned and is therefore immediately unlinked from the Entry that was passed into the function.
It is my argument, therefore, that it should be a conscious decision of the developer to keep a copy of the Entry, not the default behaviour of push.
As for push_unchecked, I am implementing a Promise architecture around the io_uring, and to reimplement push_multiple for this purpose, I must implement my own push_unchecked_promise so that I don't have to bounds check for every element that I am pushing. This implementation requires push_unchecked to be public.
I'm willing to expose push_unchecked if there's a use case for it.
I don't see any real benefit in using move instead of reference, and it would break the API, so it won't be accepted any time soon.
I think by make push_unchecked public, push_multiple can be deprecated. so push can accept E instead of &E. This can do in next version.
I propose that push_multiple is still a useful abstraction because it doesn't need to check for every element if there is room in the SQ.
I think perhaps a useful change to it would be to allow any generic which is an exact size iterator, or precisely:
#[inline]
pub unsafe fn push_multiple<T, I>(&mut self, entries: T) -> Result<(), PushError>
where
I: ExactSizeIterator<Item = E>,
T: IntoIterator<Item = E, IntoIter = I>,
{
let iter = entries.into_iter();
if self.capacity() - self.len() < iter.len() {
return Err(PushError);
}
for entry in iter {
self.push_unchecked(entry);
}
Ok(())
}
The only change this induces is it makes the input push_multiple(vec[..3]) invalid, but you can make the equivalent behaviour with this syntax: push_multiple(vec.drain(..3)) (you just need to make vec mutable).