itertools
Feature request: repeat final element
It would be nice to have an iterator wrapper that repeats the final element of the underlying iterator.
Examples:
- empty: [] -> []
- single element: [1] -> [1, 1, 1, ...]
- multiple elements: [1, 2, 3] -> [1, 2, 3, 3, 3, ...]
This would require Clone to be implemented for the item type. Ideally the wrapper would try to only start to clone elements once it reaches the end of the underlying iterator.
How's this for an implementation? https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=995803e11d0a9af8ba8611832f264ada
I think you shouldn't call self.iter.next() after you've received None once, see:
Returns None when iteration is finished. Individual iterator implementations may choose to resume iteration, and so calling next() again may or may not eventually start returning Some(Item) again at some point.
(https://doc.rust-lang.org/std/iter/trait.Iterator.html#tymethod.next)
However I can see that this now depends a bit on how this requested feature should be implemented exactly. There's a difference between:
- Repeat the same element after the underlying iterator finished for the first time.
- Repeat the last element emitted by the underlying iterator.
I think the 2nd option is easier to understand and implement at first, but I personally find it somewhat confusing that iterators can resume their work and how this would interact with this specific wrapper.
The simple solution to that is to just fuse the inner iterator beforehand. That way it never resumes once next() has returned None.
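To make the fuse suggestion concrete, here is a minimal sketch of such a wrapper. The name RepeatLast and its fields are hypothetical and not part of itertools. It fuses the inner iterator and looks one item ahead, so it only clones once the inner iterator is exhausted; the cost is that it pre-consumes one element from the inner iterator.

```rust
use std::iter::Fuse;

/// Hypothetical adapter (not an itertools API): yields every item of the
/// inner iterator, then repeats the final item forever.
struct RepeatLast<I: Iterator> {
    // Fused, so `next()` is never observed to resume after the first `None`.
    iter: Fuse<I>,
    // The item that will be returned by the next call, held back until we
    // know whether it is the final one.
    held: Option<I::Item>,
}

impl<I: Iterator> RepeatLast<I> {
    fn new(iter: I) -> Self {
        RepeatLast { iter: iter.fuse(), held: None }
    }
}

impl<I> Iterator for RepeatLast<I>
where
    I: Iterator,
    I::Item: Clone,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        if self.held.is_none() {
            // First call (or empty inner iterator): fetch the first item.
            self.held = self.iter.next();
        }
        match self.iter.next() {
            // More items follow: hand out the held item by value, no clone.
            Some(upcoming) => self.held.replace(upcoming),
            // Inner iterator is done: the held item is the last one, so
            // clone it on every call from now on (`None` if there was none).
            None => self.held.clone(),
        }
    }
}
```

With this sketch, an empty input stays empty, [1] becomes 1, 1, 1, ... and [1, 2, 3] becomes 1, 2, 3, 3, 3, ..., matching the examples in the request.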
I just stumbled on this issue as I have a good use case for it: a timeout that increases with given steps until it reaches a maximum:
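let mut retry_timeout = [10, 50, 100, 200]
    .into_iter()
    .chain(iter::once(400).cycle());
let val = loop {
    // Assumes the values are milliseconds and some_operation() returns a fresh future.
    let limit = Duration::from_millis(retry_timeout.next().unwrap());
    match tokio::time::timeout(limit, some_operation()).await {
        Ok(val) => break val,
        Err(_timeout) => continue,
    }
};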
With this feature (let's call it cycle_last()) we could write this a bit more readably as:
let mut retry_timeout = [10, 50, 100, 200, 400]
    .into_iter()
    .cycle_last();
- Repeat the same element after the underlying iterator finished for the first time.
- Repeat the last element emitted by the underlying iterator.
@crepererum I don't understand the difference between these two. Could you clarify it, maybe with an example?
I would be happy to pick this up and try a PR.
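One possible reading of the two options, offered as an assumption about what was meant rather than as the original author's answer: they only differ for an inner iterator that resumes after returning None. The sketch below uses a toy iterator (Resuming) and two helper functions, all hypothetical names, to show both behaviours side by side.

```rust
/// Toy iterator that resumes after its first `None`:
/// Some(1), Some(2), None, Some(3), None, None, ...
struct Resuming {
    calls: u32,
}

impl Iterator for Resuming {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.calls += 1;
        match self.calls {
            1 => Some(1),
            2 => Some(2),
            4 => Some(3),
            _ => None,
        }
    }
}

/// Reading 1: after the first `None`, stop polling the inner iterator and
/// keep repeating the element seen just before it.
fn repeat_after_first_none(mut inner: impl Iterator<Item = u32>, n: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut last = None;
    let mut done = false;
    while out.len() < n {
        if !done {
            match inner.next() {
                Some(x) => {
                    last = Some(x);
                    out.push(x);
                    continue;
                }
                None => done = true,
            }
        }
        match last {
            Some(x) => out.push(x),
            None => break, // empty inner iterator: nothing to repeat
        }
    }
    out
}

/// Reading 2: keep polling the inner iterator; whenever it returns `None`,
/// emit the most recent element it produced.
fn repeat_last_emitted(mut inner: impl Iterator<Item = u32>, n: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut last = None;
    while out.len() < n {
        match inner.next() {
            Some(x) => {
                last = Some(x);
                out.push(x);
            }
            None => match last {
                Some(x) => out.push(x),
                None => break,
            },
        }
    }
    out
}
```

For the toy iterator, the first reading gives 1, 2, 2, 2, 2, 2 and the second gives 1, 2, 2, 3, 3, 3. For an iterator that never resumes (or a fused one), both readings produce the same output.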
a timeout that increases with given steps until it reaches a maximum
I'd be tempted to phrase that far more directly, perhaps as
itertools::iterate(25, |x| (x * 2).min(400))
fn timeouts() -> impl Iterator<Item = u32> {
    let mut s = 25;
    std::iter::repeat_with(move || {
        let a = s;
        s = (s * 2).min(400);
        a
    })
}
I didn't know about itertools::iterate, thanks for that! Combined with min it gives a nice one-liner, and it should work very well for the general timeouts use case. I needed the jumps in timeout duration to be quite large, which is why I am using an array as a starting point. For that specific use case I still really like the idea of cycle_last().
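For reference, the doubling schedule suggested above can also be written with only the standard library. This is a sketch using std::iter::successors; the function name doubling_timeouts is hypothetical, and the units are whatever the caller assumes.

```rust
use std::iter;

/// Doubling timeouts capped at 400, starting from 25.
/// Hypothetical helper; mirrors the `iterate` one-liner with std only.
fn doubling_timeouts() -> impl Iterator<Item = u32> {
    // `successors` yields the seed first, then applies the closure to the
    // previous value; returning `Some` every time makes it infinite.
    iter::successors(Some(25u32), |&x| Some((x * 2).min(400)))
}
```

The first values are 25, 50, 100, 200, 400, 400, ..., the same sequence the repeat_with version produces.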
I guess the other thing I'll note is that this
.chain(iter::once(400).cycle());
can be simplified to
.chain(iter::repeat(400))
Or, with itertools, the whole thing is
chain!([10, 50, 100, 200], repeat(400))
which, TBH, feels really good to me.
I guess my meta-point is that if the last value is known, then chain+repeat feels like it expresses the intent perfectly well.
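As a quick check of the simplification, the sketch below builds the schedule both ways with the standard library only and lets them be compared. The function names are hypothetical, and a Vec is used instead of an array only to keep the snippet independent of the Rust edition.

```rust
use std::iter;

/// The original form from earlier in the thread: `once(400).cycle()`.
fn with_once_cycle() -> impl Iterator<Item = u64> {
    vec![10, 50, 100, 200]
        .into_iter()
        .chain(iter::once(400).cycle())
}

/// The simplified form: `repeat(400)` expresses "400 forever" directly.
fn with_repeat() -> impl Iterator<Item = u64> {
    vec![10, 50, 100, 200].into_iter().chain(iter::repeat(400))
}
```

Both yield 10, 50, 100, 200 and then 400 indefinitely, so the known final value never has to be remembered or cloned by an adapter.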
So I guess to motivate this addition more, I'd like to see something where the last one is unknown and particularly special. (Especially since adapters that need to remember the previous value have uncertainty in implementation approach about cloning or pre-consuming values.)
which, TBH, feels really good to me.
I guess my meta-point is that if the last value is known, then chain+repeat feels like it expresses the intent perfectly well.
That is great, gonna use that from now on! Thanks a lot for showing me.
Given what you have shown me, I think you are absolutely right that more motivation is needed. It is now clear what properties a use case should have, so this is a good place to leave the feature request until someone finds such a case.