Iterators that know their size after start has been called

Open davidanthoff opened this issue 8 years ago • 9 comments

Another issue that has come up during the Query.jl design: I have a whole bunch of iterators that know their length after the start method has been called.

Would it be possible to add another return value to iteratorsize that is HasLengthAfterStart(), and if a type returns that it has to implement length(source, state)?
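
A minimal sketch of what this proposal could look like, written against the current `iterate` protocol (where the first call to `iterate` plays the role that `start` played when the issue was opened). Everything named here is hypothetical and not part of Base: the `HasLengthAfterStart` trait value, the `extended_iteratorsize` trait function, the `LazySource` toy type, and the two-argument `length(source, state)` method.

```julia
# Hypothetical trait value; Base does not define this.
struct HasLengthAfterStart end

# Hypothetical trait function; falls back to Base's trait for ordinary iterators.
extended_iteratorsize(x) = Base.IteratorSize(x)

# Toy source: `load` stands in for an expensive step such as running a query.
struct LazySource{F}
    load::F
end

# State produced by the first `iterate` call; it carries the loaded data.
struct LazyState
    data::Vector{Int}
    next::Int
end

Base.eltype(::Type{<:LazySource}) = Int
Base.IteratorSize(::Type{<:LazySource}) = Base.SizeUnknown()  # what Base clients see
extended_iteratorsize(::LazySource) = HasLengthAfterStart()   # what aware clients see

function Base.iterate(s::LazySource)
    data = s.load()                      # the size only becomes known here
    isempty(data) && return nothing
    return data[1], LazyState(data, 2)
end

function Base.iterate(::LazySource, st::LazyState)
    st.next > length(st.data) && return nothing
    return st.data[st.next], LazyState(st.data, st.next + 1)
end

# Length is a function of the source *and* the state, so the source stays immutable.
Base.length(::LazySource, st::LazyState) = length(st.data)

# A client that uses the extra information purely as a performance hint.
function materialize(src)
    out = eltype(src)[]
    res = iterate(src)
    res === nothing && return out
    item, st = res
    if extended_iteratorsize(src) isa HasLengthAfterStart
        sizehint!(out, length(src, st))  # reserve capacity once the size is known
    end
    while true
        push!(out, item)
        res = iterate(src, st)
        res === nothing && break
        item, st = res
    end
    return out
end
```

For example, `materialize(LazySource(() -> [10, 20, 30]))` fills a vector whose capacity was reserved after the first `iterate` call, while ordinary iterators such as `1:3` take the plain `push!` path.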

davidanthoff avatar Jun 21 '17 22:06 davidanthoff

See https://github.com/JuliaLang/julia/issues/8149 and https://github.com/JuliaLang/julia/issues/18823

tkelman avatar Jun 22 '17 00:06 tkelman

> See #8149 and #18823

I assume this was just meant for cross-reference? Neither issue proposes something that would address the issue here.

davidanthoff avatar Jun 22 '17 03:06 davidanthoff

Related: #16708

mschauer avatar Jul 06 '17 11:07 mschauer

@davidanthoff Can't this be handled by implementing a stateful iterator? In the new protocol, iterate(x) could then have your desired side effect.
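
One possible reading of this suggestion, as a sketch only: a mutable source that loads its data as a side effect of the first `iterate` call, after which `length` becomes answerable. The `StatefulSource` type and its `load` field are hypothetical names chosen for illustration.

```julia
# Hypothetical mutable source; `load` stands in for an expensive step.
mutable struct StatefulSource{F}
    load::F
    data::Union{Nothing, Vector{Int}}
end

StatefulSource(load) = StatefulSource(load, nothing)

Base.eltype(::Type{<:StatefulSource}) = Int

function Base.iterate(s::StatefulSource, i::Int = 1)
    if s.data === nothing
        s.data = s.load()        # side effect: the first call populates the source
    end
    data = s.data
    i > length(data) && return nothing
    return data[i], i + 1
end

function Base.length(s::StatefulSource)
    data = s.data
    data === nothing && error("length is not available before the first call to iterate")
    return length(data)
end
```

Note that this type keeps the default `Base.HasLength()` trait, so nothing tells a client that `length` must wait until `iterate` has been called once; a client that asks for `length` up front would hit the error branch.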

laborg avatar Oct 17 '23 04:10 laborg

I think the new iteration protocol handles this, yes.

KristofferC avatar Oct 17 '23 05:10 KristofferC

Hm, I might be missing something, but I don't think this is addressed with the new protocol? How would a source indicate to a client that length can be called after the first call to iterate?

If a source returns Base.HasLength from IteratorSize, then length has to work without a call to iterate. If it returns SizeUnknown then a client really has to assume that length can never be called. Neither case seems to cover what I'm after.
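
To make the two existing cases concrete, here is a small sketch of a client that dispatches on `Base.IteratorSize`. The function name `gather` is made up for this example; the trait values and `Iterators.filter` are from Base.

```julia
# Dispatch on the size trait of the source.
gather(itr) = gather(Base.IteratorSize(itr), itr)

# HasLength / HasShape: the client may call length before iterating at all.
function gather(::Union{Base.HasLength, Base.HasShape}, itr)
    out = Vector{eltype(itr)}(undef, length(itr))
    for (i, x) in enumerate(itr)
        out[i] = x
    end
    return out
end

# SizeUnknown: the client must never call length and grows the result instead.
function gather(::Base.SizeUnknown, itr)
    out = eltype(itr)[]
    for x in itr
        push!(out, x)
    end
    return out
end
```

Here `gather(1:3)` takes the preallocating path, while `gather(Iterators.filter(isodd, 1:5))` takes the `push!` path. There is no third method a source could opt into for "length is known, but only once iteration has started".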

https://github.com/queryverse/IteratorInterfaceExtensions.jl#iteratorsize2 has an implementation of what I'm suggesting. That works OK for now, as this is essentially just used to trigger a performance optimization, but I still think it would make sense to have this in base itself.

davidanthoff avatar Oct 17 '23 17:10 davidanthoff

The contract for length is that it does not change during iterate, so it seems odd that calling iterate would make it available when it was not before

vtjnash avatar Oct 17 '23 20:10 vtjnash

FWIW though, I think the iteration protocol already expects that iterate has been called at least once before length is called for Base.HasLength, so that is already the expected definition for it

vtjnash avatar Oct 17 '23 20:10 vtjnash

> The contract for length is that it does not change during iterate, so it seems odd that calling iterate would make it available when it was not before

The proposal here is not that length returns something different after iterate is called or that the return value would change during iteration. The proposal is that a source can signal to a client that length should not be called until iterate has been called once, i.e. it really is more of a signal that length is undefined behavior until a certain point in time.

> FWIW though, I think the iteration protocol already expects that iterate has been called at least once before length is called for Base.HasLength, so that is already the expected definition for it

Really? I certainly would not have guessed that at all from looking at the docs. Also, just very briefly looking through Julia base code, that does not seem to be how iterators are used, for example the code here would then be an incorrect consumption of an iterator, right?

Wouldn't that also be a really odd interpretation with the current stateless design of iterators? If length(iter) was only valid after a call to iterate(iter), then that would bake a mutating design into the iteration protocol, which seems a bit odd. That is why in IteratorInterfaceExtensions.jl I added a new method signature for length, length(iter, x), where x is the iteration state, for this scenario.

davidanthoff avatar Oct 18 '23 00:10 davidanthoff