icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

Add explicit ZeroVec and VarZeroVec iterator types

Open sffc opened this issue 1 year ago • 4 comments

The standard library exports concrete types for iterators. However, our ZeroVec types only return impl Iterator. This limits a certain amount of flexibility; for example, you can't put multiple iterators together into a list because the compiler doesn't know that the impl Iterators are going to be the same type.

I also question whether impl Iterator is highly efficient in terms of code size and performance. I remember pulling apart some spaghetti iterators into procedural code and getting a boost. Returning a concrete type could make code simpler to compile.

Thoughts? @Manishearth

sffc avatar Dec 22 '23 00:12 sffc

This was a deliberate choice so as not to overcommit to the implementation of the iterator, especially since we are returning adapters.

We can write explicit adapters, however:

  • we should make them opaque types like ZeroVecSliceIter
  • we need to be wary of perf losses since doing this transform requires somewhat unrolling the iterator chain, which is sometimes harder to optimize without unsafe

you can't put multiple iterators together into a list because the compiler doesn't know that the impl Iterators are going to be the same type

The compiler should allow you to do this as long as both iterators come from the same method. It's intermediate methods that cause this problem.

I also question whether impl Iterator is highly efficient in terms of code size and performance. I remember pulling apart some spaghetti iterators into procedural code and getting a boost. Returning a concrete type could make code simpler to compile.

This is simply not the case; that's not how impl Iterator works. I suspect your boost was due to making things procedural, not because of impl Iterator vs explicit iterators.

Manishearth avatar Dec 22 '23 01:12 Manishearth

The compiler should allow you to do this as long as both iterators come from the same method. It's intermediate methods that cause this problem.

Yes and I have intermediate methods. I worked around the problem in my datetime PR by using the as_ule_slice() escape hatch, which I am glad we expose, instead of ZeroVec's own iterators.

I suspect your boost was due to making things procedural, not because of impl Iterator vs explicit iterators.

What I was referring to is https://github.com/unicode-org/icu4x/pull/2603. What I saw in the bytecode while working on that change was a mess of iterator abstractions (zip, chain, ...) which went away when I rewrote the code using a for loop.

I wanted to discuss the ZeroVec iterators because they are currently written as

    pub fn iter(&self) -> impl DoubleEndedIterator<Item = T> + ExactSizeIterator<Item = T> + '_ {
        self.as_ule_slice().iter().copied().map(T::from_unaligned)
    }

That impl is probably fine because I think copied and map are most likely zero-cost abstractions. My problems previously were specifically with zip and chain which are not zero-cost.

sffc avatar Dec 22 '23 18:12 sffc

Also it makes it trickier to use with trait types that need a concrete type. I think a very recent Rust addresses this (https://github.com/rust-lang/rust/issues/91611) but it is not MSRV yet.

EDIT: Actually it is https://github.com/rust-lang/rust/issues/63063 which is not stable yet.

sffc avatar Dec 22 '23 19:12 sffc

I wanted to discuss the ZeroVec iterators because they are currently written as

Ah, that's a different problem than impl Trait.

I would measure this. 99% of the time iterator chains are more efficient than loop iteration.


Either way, I'm in favor of using concrete named types; I just didn't choose to do this initially. We should be very careful to give them a limited API though.

Manishearth avatar Dec 23 '23 16:12 Manishearth