itertools
itertools copied to clipboard
New iterize() function?
Hi folks. Is the iterize() function demonstrated below of any interest? It's rather like itertools::iterate(), making it easy to create a certain class of iterators via something like induction. But for the cases it covers (the state must be Copy, and should be small), it offers 1) a simplified API and 2) improved performance.
UPDATE: I take that back. In some cases the performance is better, but in other cases it is worse.
For example, both of these produce the Fibonacci sequence:
iterate((1,1), |&(a,b)| (b, a+b)).map(|p| p.0)
iterize((1,1), |(a,b)| (b, a+b))
You specify an initial state and a rule for converting the Nth state into the (N+1)th state. At least for the Fibonacci sequence iterize seems to have a slight edge on all of the other implementations (including itertools::iterate) offered in this forum thread (although those benchmarks do more than just iterate):
https://users.rust-lang.org/t/any-possible-improvements-to-this-functional-program/39085/61
I haven't contributed before, and am still somewhat new to Rust, but wondered whether this might be of interest for itertools. Here's the demo:
fn main() {
dbg!(fib().take(6).collect::<Vec<i32>>());
}
// Returns an Iterator for the Fibonacci sequence: 1 1 2 3 5 8 ...
fn fib() -> impl Iterator<Item = i32> {
iterize((1,1), |(a,b)| (b, a+b))
}
// Produces an Iterator by induction.
// Given an initial state of type (R,S) and a function that produces
// the next state from an existing state, we return an Iterator for the Rs.
// So in (R,S), R is the part that gets (R)eturned by the Iterator,
// and S is any additional (S)tate used internally.
fn iterize<R: Copy, S: Copy>(s0: (R,S), f: impl Fn((R,S)) -> (R,S)) -> impl Iterator<Item = R> {
let mut state = s0;
std::iter::repeat_with(
move || { state.swap(f(state)).0 }
)
}
// a.swap(b) sets a to b and returns the old value of a.
// It is better than using std::mem::replace() directly in many cases
// because 1) you don't have to explicitly say &mut, and 2) it saves
// you having to make a temporary variable when the new value
// is computed from the old value (because of two-phase mutable borrows).
pub trait Swap: Sized {
fn swap(&mut self, value: Self) -> Self;
}
impl<T> Swap for T {
fn swap(&mut self, new: Self) -> Self {
std::mem::replace(self, new)
}
}
The benchmarks for the implementations in that forum thread are here: https://github.com/NicholasSterling/rust_functional_fibo See especially the "iterators" branch.
It just occurred to me (duh) that you could make this work like
iterize((1,1), |a,b| (b, a+b))
rather than
iterize((1,1), |(a,b)| (b, a+b))
although I haven't tried it yet to see whether that will impact performance.
The obvious question to me is whether it's needed vs the other things that currently exist, like
- https://docs.rs/itertools/0.10.0/itertools/fn.unfold.html
- https://doc.rust-lang.org/nightly/std/iter/fn.from_fn.html
How hard is the example to encode in one of those instead?
Here's a four-way showdown between some of the current options (playground):
pub fn with_unfold() -> impl Iterator<Item=u32> {
use itertools::unfold;
unfold((0, 1), |(current, next)| {
*next += *current;
*current = *next - *current;
Some(*current)
})
}
pub fn with_from_fn() -> impl Iterator<Item=u32> {
use core::iter::from_fn;
let mut current = 0;
let mut next = 1;
from_fn(move || {
next += current;
current = next - current;
Some(current)
})
}
pub fn with_iterate() -> impl Iterator<Item=u32> {
use itertools::iterate;
iterate((0, 1), |(prev, current)| {
(*current, current + prev)
}).map(|(_, current)| current)
}
pub fn with_successors() -> impl Iterator<Item=u32> {
use core::iter::successors;
successors(Some((0, 1)), |&(prev, current)| {
Some((current, current + prev))
}).map(|(_, current)| current)
}
fn main() {
use itertools::Itertools;
println!("{:?}", with_unfold().take(10).format(","));
println!("{:?}", with_from_fn().take(10).format(","));
println!("{:?}", with_iterate().take(10).format(","));
println!("{:?}", with_successors().take(10).format(","));
}
They all feel similar enough to me that I'm having a hard time meaningfully contrasting them.
Another thought: I might like this better if it had a name more evocative of how it turns the closure into an iterator.
Like I'm a fan of unfold because I understand how fold works, and can use that to understand unfold.
I agree that it isn't really that hard to do it with any of a number of existing APIs, as @jswrenn so effectively demonstrated! Of those solutions, the iterate() version seems the most straightforward. And while there might be performance differences, they probably aren't huge. Perhaps because I was (still am, really) getting to know Rust, I was thinking of it from the point of view of a Javascript or Python programmer checking out Rust, a language they had heard was hard. Perhaps for them the difference between these would matter:
iterate((1,1), |&(a,b)| (b, a+b)).map(|(a,b)| a)
induce((1,1), |a,b| (b, a+b))
But I agree, for anyone familiar with the language the difference is not so large. Unless this would fit a lot of simple cases and perform well, perhaps it's just not worth doing. Thanks for taking the time to consider it.
No big performance gain expected and there are multiple alternatives. Therefore, I'm closing this.