itertools icon indicating copy to clipboard operation
itertools copied to clipboard

Add a `MapWindows` Iterator adapter

Open Crazytieguy opened this issue 2 years ago • 9 comments

For problems that require working with windows of expensive to clone items, or large windows, tuple_windows isn't a good solution. Implementing a borrowed windows iterator is tricky if not impossible, but if the user provides a mapping (or folding, etc...) function that returns an owned value it becomes trivial. I decided to focus on MapWindows for this first issue, but other interesting adapters could be MapArrayWindows (with const generic), MapWindowsMut, FoldWindows and combinations of these concepts.

Here's an example Implementation. Since this could be used for large windows or windows of large items, I assumed shifting the items is also likely to be expensive. So I only drop old items every size iterations as a compromise between memory usage and speed. I'm not sure this is the best decision and am open to discussion :)

pub struct MapWindows<I: Iterator, F, T>
where
    F: FnMut(&[I::Item]) -> T,
{
    iter: I,
    f: F,
    size: usize,
    buf: Vec<I::Item>,
}

impl<I: Iterator, F, T> MapWindows<I, F, T>
where
    F: FnMut(&[I::Item]) -> T,
{
    pub fn new(mut iter: I, size: usize, f: F) -> Self {
        let buf = iter.by_ref().take(size - 1).collect();
        Self { iter, f, size, buf }
    }
}

impl<I: Iterator, F, T> Iterator for MapWindows<I, F, T>
where
    F: FnMut(&[I::Item]) -> T,
{
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        self.iter.next().map(|next| {
            if self.buf.len() == self.size * 2 - 1 {
                self.buf.drain(..self.size);
            }
            self.buf.push(next);
            (self.f)(&self.buf[self.buf.len() - self.size..])
        })
    }
}

Crazytieguy avatar Jan 11 '23 14:01 Crazytieguy

Hi there, I think I somewhat understand your idea.

If I understand correctly, this basically collects into a bounded, contiguous memory area one by one and exploits that a function will be applied to the slice. Please correct me if I'm wrong.

Wouldn't the canonical thing for this use-case be a VecDeque?

phimuemue avatar Jan 11 '23 21:01 phimuemue

I think you understood correctly :)

A VecDeque is definitely more canonical, but the issue with it is that it's not contiguous. I had some ideas on how to utilize a VecDeque, but I didn't like any of them. What do you think?

  • Call make_contiguous on every iteration (making it just a worst vector)
  • Give the closure a reference to the VecDeque instead of a slice
  • Give the closure an array of references, built with something like from_fn(|i| &self.buf[i]) (requires const generics)

Crazytieguy avatar Jan 12 '23 09:01 Crazytieguy

I think the ideal solution would be to not require the function f at all. (But I think it's not that easy resp. impossible if we do not have lending iterators.)

As to your question: My gut feeling would be to go with option 2 (give closure a reference to VecDeque), but I do not want to commit myself without having thought about it (or even having collected use cases).

phimuemue avatar Jan 12 '23 20:01 phimuemue

Makes sense. How do you usually collect use cases? This was my use case - an over optimized solution to Advent of code day 23 😅 https://github.com/Crazytieguy/advent-of-code/blob/master/2022/src/bin/day23/main.rs

Compared to using tuple windows this runs in about half the time

Crazytieguy avatar Jan 12 '23 21:01 Crazytieguy

A couple more thoughts:

  • It might be easier to find use cases for the mutable version
  • Do you know of any progress towards lending iterators? Will they allow something like windows_mut?

Crazytieguy avatar Jan 12 '23 23:01 Crazytieguy

How do you usually collect use cases?

I have a list of things that I needed in my private projects, and when things are discussed, I may bring items from that list to the discussion. So essentially as you did with your use case.

Do you know of any progress towards lending iterators? Will they allow something like windows_mut?

I do not closely track the progress on this. I found https://rust-lang.github.io/generic-associated-types-initiative/design_patterns/iterable.html, and via GADTs you can implement lending iterators on your own (see https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=bedd591aa7940e141a721eee4f572b80). However, this introduces another kind of Iterator that loses interoparability with Itertools and other existing functions.

I am not sure if there are plans change the Iterator trait to support lending in one of the next editions.

phimuemue avatar Jan 13 '23 07:01 phimuemue

  • Will they allow something like windows_mut?

Were you in a case where Cell would work? Because you can sometimes use https://doc.rust-lang.org/std/cell/struct.Cell.html#method.as_slice_of_cells along with windows to get something similar to windows_mut.

scottmcm avatar Jan 13 '23 07:01 scottmcm

I think the playground example is cool - and I think maybe the best pattern would be to have a way to turn an Iterator2 back into an iterator, using a method like map the makes it no longer lending. So some new methods for iterators will return lending iterators and vice versa. The lending_iterator crate does basically that, but it has quite a clumsy api. I think I might try to write a similar crate myself 😁

Crazytieguy avatar Jan 13 '23 23:01 Crazytieguy

Well this probably no longer concerns itertools, but if you want I'd appreciate feedback on this, as a proof of concept. I'm trying to contact the creator of lending_iterator to see if they'd want to modify their crate to something like this

Crazytieguy avatar Jan 15 '23 09:01 Crazytieguy