iter
iter copied to clipboard
`\iter\isEmpty()` can indirectly cause unexpected consumption from iterators in rare cases
I've come across a rare case where the behavior of \iter\isEmpty() can indirectly cause unexpected consumption from iterators in normal use cases.
Specifically, in the case where isEmpty is provided an object of \IteratorAggregate, the object's getIterator() method is called. Any further uses of this object may call getIterator() a second time, so if this method is not a pure function then the first item(s?) we might expect from the iterator could be lost to the void.
I've put together what I believe is a minimal reproduction case here: https://gist.github.com/athrawes/e451484fdb4d0646f9aa03c9e47b566a
Basically, if the item passed to isEmpty looks like this:
class MyClass implements \IteratorAggregate {
public function getIterator(): \Traversable {
// some logic which consumes some resource
// or otherwise mutates global mutable state
while ($item = array_pop($someGlobalState)) { yield $item; }
}
}
then the following does not behave as expected:
$iterable = new MyClass();
\iter\isEmpty($iterable); // <- initially appears to behave as expected, items have not technically been consumed yet
foreach ($iterable as $item) { ... } // <- missing item(s?) from the front as we got a 'new' iterable with said items missing
Then interacting with that item in the future with any foreach constructs or any methods in this library will be missing some item(s?) from the front of the iterable.
I'm not sure if this is really a bug in this library per se, but it is rather unexpected - especially as the docblock on isEmpty claims to not consume items from iterators. Would this simply be a matter of warning users about this edge case, or is there some way to handle this here?
I don't think it's possible to do anything against this, apart from rejecting IteratorAggregate in isEmpty entirely, which doesn't seem desirable.
I'd consider a side-effecting IteratorAgggregate implementation to be a bug -- where did you encounter this?
The side-effecting \IteratorAggregate implementations turn out to be defined in my own codebases. It appears that once upon a time I wrote some convenience classes which look almost exactly like the following to abstract over popping items off a global (external) queue:
class MyClass implements \IteratorAggregate {
public function getIterator(): \Traversable {
// some logic which mutates global mutable state
while ($item = array_pop($someGlobalState)) { yield $item; }
}
}
So, to be clear, the buggy \IteratorAggregate classes are issues of my own design. That said, the docs for \IteratorAggregate don't seem to indicate that side-effecting implementations of getIterator() should be considered a bug or avoided for any reason.
It just so happens that there is a bit of a footgun here that I've encountered; I've shot myself in the foot with a valid (if not necessarily best practice) implementation of \IteratorAggregate being used with isEmpty(), where the combination causes side effects which were not clearly signposted anywhere.
I think I agree with you here that there's not much code-wise we can do here while still handling \IteratorAggregate; I'm wondering if it'd be best to simply add a note to the function documentation to call out this behavior?
The side-effecting
\IteratorAggregateimplementations turn out to be defined in my own codebases. It appears that once upon a time I wrote some convenience classes which look almost exactly like the following to abstract over popping items off a global (external) queue:class MyClass implements \IteratorAggregate { public function getIterator(): \Traversable { // some logic which mutates global mutable state while ($item = array_pop($someGlobalState)) { yield $item; } } }So, to be clear, the buggy
\IteratorAggregateclasses are issues of my own design. That said, the docs for\IteratorAggregatedon't seem to indicate that side-effecting implementations ofgetIterator()should be considered a bug or avoided for any reason.
Interesting case! The actual IteratorAggregate implementation here is side-effect free -- getIterator() will just return the Generator. The problem is that the call to valid() has a side-effect. For generators, the only way to determine whether there are any more elements is to actually complete the next one, and that has a side-effect in your case.
When working on an Iterator, this is fine, because even though valid() has a side-effect, it will not be executed a second time when calling current(). The computed value is stored. But for IteratorAggregate, we will execute the side effect twice.
I can't really blame you for this one -- I think your code looks perfectly reasonable, and there's really no other way to implement it short of a manual Iterator implementation, which could implement a side-effect-free valid() method.
I think I agree with you here that there's not much code-wise we can do here while still handling
\IteratorAggregate; I'm wondering if it'd be best to simply add a note to the function documentation to call out this behavior?
So yeah, I think a documentation note is probably the best we can do here.
I close my PR then if, it fixes the problem but it's not in really good way