itertools
itertools copied to clipboard
Laziness of our iterators
Similar to #601 but for all our iterators and iterator adaptors. Laziness reference: https://doc.rust-lang.org/std/iter/index.html#laziness
And I think each iterator struct should have the related must_use attribute:
- Adaptors:
#[must_use = "iterator adaptors are lazy and do nothing unless consumed"] - Non-adaptors:
#[must_use = "iterators are lazy and do nothing unless consumed"]
Load some items might be unavoidable in some case(?) but surely not all. TODO:
- [x] #794
- [x] #793
permutations(n)consumesnitems at definition (so not lazy whenn >= 1). - [x] #795
combinations(n)consumesnitems at definition (so not lazy whenn >= 1). - [x] #797
intersperse[_with]both consume 1 item at definition. - [x] #800
a.cartesian_product(b)consumes 1 item ofaat definition. Thereforeiproduct!(i_1, i_2, i_n)consumes 1 item of the firstn - 1iterators at definition. - [x] #801
coalesce,dedup[_by][_with_count]all consume 1 item at definition. - [x] #792
- [x] #767 mention laziness for new iterator adaptors and where to test it.
- [ ]
tuple_combinations::<T>consume 1 item per nested iterator at definition ("T::len()" - 1items so not lazy for(_, _, ...)). - [ ]
kmerge[_by]collect all items and consume 1 item of each iterator at definition.
About TupleCombinations and KMergeBy, the only way I see to make them lazy is to delay the initialization step with a LazyInit type (for e.g. based on Into conversion which is already done for the TupleCombinations part).
// `Option` (or a third variant) is a necessary implementation detail to take ownership of `U`.
// `Uninit(None)` (or the 3rd variant) would be unreachable.
pub(crate) enum LazyInit<U, I> {
Uninit(Option<U>),
Init(I),
}
impl<U, I> LazyInit<U, I> { // useful later for
pub fn new(uninit: U) -> Self; // at definition
pub fn get(&self) -> Result<&I, &U>; // size_hint
pub fn get_mut(&mut self) -> &mut I where U: Into<I>; // next, ...
pub fn take_init(self) -> I where U: Into<I>; // fold
// eventually
pub fn take_uninit(self) -> Result<U, I>; // count
}
The most important method is get_mut which handle initialization once and simply return the mutable reference of I otherwise.
I worry it might result in a slower iterator.
@jswrenn @phimuemue What do you think about such pattern/type?