rfcs
rfcs copied to clipboard
drain_filter_take
Currently we have the unstable Vec::drain_filter:
pub fn drain_filter<F>(&mut self, filter: F) -> DrainFilter<'_, T, F, A>
where
F: FnMut(&mut T) -> bool;
My use case is that I want to perform some side effects in filter, where the side effects would determine whether the item should be drained, but performing the side effects requires moving out the T if the element is to be filtered.
I propose a new method drain_filter_take that takes the form
pub fn drain_filter_take<F>(&mut self, filter: F) // does not return an iterator
where
F: FnMut(T) -> Option<T>;
(I am not sure what a good name would be that clearly indicates it does not return an iterator)
If Some is returned, it replaces the original value; if None is returned, the value is drained. This allows the filter to take ownership of T and only return it if not drained.
I think it would also be useful to be able to drain a vector by mapping a function that takes each item by value and decides whether to yield a transformed version of it or put it back, kind of like filter_map:
pub fn drain_map_ok<U, F>(&mut self, filter: F) -> DrainMapOk<'_, U, T, F, A>
where
F: FnMut(T) -> Result<U, T>;
So all of the Oks would be iterated while all the Errs would end up in the original vector.
My specific use case for this is extracting all of the literals from a list of expressions in the optimization pass of my compiler.
@Johan-Mi originally I wanted to call it a drain_filter_map, but turns out that removed items are already moved out in the closure, so it doesn't make sense to return an iterator here.
oh, I see you're talking about a Result not an Option. But it'd be weird to have it Ok/Err, isn't it? I get your motivation of that mapping is the Ok variant and retaining is the Err variant, but it sounds unnecessarily complex to use.
You're right, perhaps it would be clearer with a different return type for the callback, with enum variants named something like Take or PutBack. It is unfortunate to have to deal with putting the values back if you don't want to keep them but Rust doesn't really provide any good way to optionally consume a value.
Why not use vec.into_iter().filter_map(...).collect()?
@the8472 That consumes the entire original vector instead of only removing certain elements.
the collect output contains the remaining elements which you can then continue to use
If you choose to collect the remaining elements then you'll lose the other elements—or you could do it the other way around. Either way you lose one half of the input, making it useful only for side effects.
The proposed method pub fn drain_filter_take<F>(&mut self, filter: F) // does not return an iterator does discard half of the elements, which is why I suggested the collect approach.
I suppose I should turn my proposal into a separate issue, I don't have much experience with RFCs and only added it here since it seemed somewhat related. To clarify, what I'm suggesting is a different method that does return an iterator.
@the8472 doesn't that reallocate the whole vector in a new buffer?
Hmm... rust vectors are not linked lists, so if you want to take ownership of an element in the middle of a vector an "drain" it out if the vector, then you have to describe how the memory structure if the vector looks afterwards. One idea is to do something like swap_remove.
The actual drain internals use some unsafe trickery where they decrease the length of the vector before creating the iterator and then use some raw reads, which seem to be guaranteed to be safe since the iterator has a mutable reference to the vector. I'm not too familiar with the details but it seems like the methods proposed by @SOF3 and I should both be doable with zero allocations.
@the8472 doesn't that reallocate the whole vector in a new buffer?
It's not guaranteed, but currently the reallocation is optimized away.
oh, I see you're talking about a Result not an Option. But it'd be weird to have it Ok/Err, isn't it? I get your motivation of that mapping is the Ok variant and retaining is the Err variant, but it sounds unnecessarily complex to use.
A simple alternative would be just creating an enum with appropriate cases, e.g.
enum FilterMapInnerResult<T, U> {
/// element gets put back into the vector
Keep(T),
/// element gets yielded (and maybe transformed),
/// and efficiently removed from the vector
Yield(U),
}