iter
iter copied to clipboard
Feature Request: improve performance by introducing size hint
Currently, the implementation of the Collect
doesn't reserve capacity for iterator:
https://github.com/mtoohey31/iter/blob/7eec6de0d2ad032b51126c2df2f68a75dfde5d2a/iter.go#L51
But actually, iterator can provide size hint to indicate how much memory it needs. Rust adopt such implementation:
pub trait Iterator {
type Item;
// Required method
fn next(&mut self) -> Option<Self::Item>;
fn size_hint(&self) -> (usize, Option<usize>);
// ...
}
With the size_hint
method, iterator in rust can indicate how much memory it needs, which can improve performance in methods like Collect
.
An possible adoption of size_hint
is to change the definition of iter.Iter[T]
to something like:
type SizeHint struct {
LowerBound int
UpperBound tuple.T2[int, bool]
}
type Iter[E any] struct {
next func() (E, bool)
sizeHint SizeHint
}
I think a more rust-like interface is definitely worth considering, but at this point I haven't made the switch yet because I'm not sure that the benefits outweigh the extra complexity.
As far as I can tell, with the current Go compiler, this potential improvement to the Collect
method would be the only benefit to such a switch. I'm also not entirely convinced that this would lead to better performance in all cases either since slices are grown by a factor of their current size when being resized, so the existing implementation's runtime is still linear in terms of the input iterator's size. In addition, we already have CollectInto
, which partially fills this gap in cases when the user can trivially determine the length of the output iterator, since they can allocate their own slice of the correct length.
That being said, in the future a switch to the following sort of interface might very useful:
type SizeHint struct {
Lower uint
Upper Opt[uint]
}
type Iter[T any] interface {
Next() (T, bool)
SizeHint() SizeHint
}
func Collect[T any, I Iter[T]](i I) []T { ... }
...if the Go compiler improves its generics support enough to the point where I's method calls can be inlined within Collect
(and other functions declared similarly), since that should technically be possible. I tried playing with a re-implementation like this in Compiler Explorer not long ago, but the compiler didn't successfully inline anything at that time. Making Collect
a function would also be necessary for this to work because of the lack of support for generic methods, which would be unfortunate in terms of readability.
Based on the above, I'm going to leave this ticket open for now.