Dyno: eagerly build iterator groups (in the scope where they're defined)
This PR changes the way that iterators are resolved: resolution no longer considers the scope in which the iterator is iterated over, only the scope in which it was defined / created. This came out of a discussion with @benharsh, @bradcray and @mppf about the expected semantics of iterators. I wanted to get these changes (which are largely architectural in nature) out of the way before proceeding with other iterator-related tasks such as promotion.
In production, each time an iterator is used, the compiler performs a search for matching overloads. E.g., if a serial iterator is defined in one place and a standalone overload is then defined in the scope of a loop, the standalone overload will be preferred:
iter myIter() {
  yield 1;
}
{
  iter myIter(param tag) where tag == iterKind.standalone {
    yield "hello";
  }
  [i in myIter()] {
    writeln(i); // prints "hello"
  }
}
However:
iter myIter() {
  yield 1;
}
{
  iter myIter(param tag) where tag == iterKind.standalone {
    yield "hello";
  }
}
[i in myIter()] {
  writeln(i); // prints 1
}
There are all sorts of other fun cases in which the yield type of the iterator can change depending on what it is being zippered with, since the follower is re-resolved using the new leader's type:
iter myIter() do yield 1;
iter myIter(param tag) where tag == iterKind.leader do yield (1..1,);
iter myIter(param tag, followThis) where tag == iterKind.follower do
  yield if followThis.type.size == 1 then 1 else "hello";
iter otherIter() do yield 1;
iter otherIter(param tag) where tag == iterKind.leader do yield (1..1, 1..1);
iter otherIter(param tag, followThis) where tag == iterKind.follower do yield 1;
forall i in myIter() do
  writeln(i); // prints 1
forall (_, i) in zip(otherIter(), myIter()) do
  writeln(i); // prints "hello"
This PR changes this process in Dyno: the types of the leader, follower, and standalone iterators associated with a particular function are now determined using the scope in which the function is declared, as opposed to the scope in which it's being used. This is somewhat complicated by the existence of point-of-instantiation scopes; generic iterators do include their instantiation information, so that other overloads are resolved with the same instantiation scope in mind. However, this is determined at the point where the iterator is created, as opposed to the point where the iterator is used in a loop.
As of this PR, both versions of the bracket loop program print `1`, and the program in which the leader type affects resolution fails to resolve (because only one follower type is allowed per iterator).
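To make the scoping change concrete, here is a toy C++ model of the lookup. It is only a sketch and not Dyno's actual code: `Scope`, `IterKind`, `isVisible`, and `pickForBracketLoop` are names made up for illustration. The only thing that differs between the old and new behavior is which scope the search starts from.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical toy model; none of these types exist in Dyno under these names.
enum class IterKind { Serial, Standalone, Leader, Follower };

struct Scope {
  const Scope* parent = nullptr;
  // For each iterator name, the overload kinds declared directly in this scope.
  std::map<std::string, std::set<IterKind>> decls;
};

// Walk outward from `from` and report whether a `kind` overload of `name`
// is declared in that scope or any enclosing scope.
bool isVisible(const Scope* from, const std::string& name, IterKind kind) {
  for (const Scope* s = from; s != nullptr; s = s->parent) {
    auto it = s->decls.find(name);
    if (it != s->decls.end() && it->second.count(kind) > 0) return true;
  }
  return false;
}

// Choose the overload for a bracket loop: prefer standalone when one is
// visible from `searchFrom`, otherwise fall back to serial.
//   production behavior: searchFrom = scope containing the loop
//   this PR:             searchFrom = scope where the iterator was defined
IterKind pickForBracketLoop(const Scope* searchFrom, const std::string& name) {
  return isVisible(searchFrom, name, IterKind::Standalone)
             ? IterKind::Standalone
             : IterKind::Serial;
}
```

With an outer scope declaring only the serial `myIter` and a nested block declaring the standalone overload, starting the search at the nested block finds the standalone overload, while starting at the outer (defining) scope never sees it.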
Changes in This PR
Bundling PoI scope with iterators
The main trick with this PR was to store enough information in an `IteratorType` so that we can reconstruct the context in which it was created. This way, given an iterator type, we have all the information we need to find its various overloads. The most complicated situation in this case is when the iterator was constructed from a call to a generic `iter` procedure; in this case, the body of the procedure (and presumably the other overloads for the iterator) can use functions from the context of the call, via the point-of-instantiation mechanism. This allows programs like the following to resolve:
iter myIter(arg) {
  yield computeReturn(arg);
}
iter myIter(arg, param tag) where tag == iterKind.standalone {
  yield computeReturn(arg);
}
{
  proc computeReturn(arg: int) {
    return arg + 1;
  }
  forall i1 in myIter(13) {}
  for j1 in myIter(13) {}
}
{
  proc computeReturn(arg: int) {
    return arg == 1;
  }
  forall i2 in myIter(13) {}
  for j2 in myIter(13) {}
}
In the first block, `myIter` is instantiated in a scope in which `computeReturn` returns `int`, which means `myIter` (and all of its versions, including the parallel one) should return `int` as well. In the second block, another `computeReturn` produces `bool`, so `myIter` invocations there should create iterators that yield `bool`. Iterator values can be returned and passed around to different contexts (changing PoI), which to me suggests that the iterator type itself should include the point-of-instantiation information.
This PR takes that approach, storing a PoI scope with the iterator.
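As a rough illustration of what "storing a PoI scope with the iterator" buys us, here is a small C++ sketch. `PoiScope`, `ToyIteratorType`, and `yieldTypeOf` are hypothetical stand-ins, not the real Dyno classes, and the yield type is reduced to a string for brevity.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a point-of-instantiation scope. All we record is
// what computeReturn(int) resolves to when looked up from that scope.
struct PoiScope {
  std::string computeReturnType;
};

// Hypothetical stand-in for an iterator type: the iter procedure it came
// from, plus the PoI scope captured where the iterator was created.
struct ToyIteratorType {
  std::string iterName;
  const PoiScope* poi;

  // Two iterator types are the same only if both the procedure and the
  // captured PoI scope match.
  bool operator==(const ToyIteratorType& other) const {
    return iterName == other.iterName && poi == other.poi;
  }
};

// Yield type of any overload in the iterator's group (serial or parallel).
// It depends only on the PoI stored in the type, never on the scope of the
// loop that ends up consuming the iterator.
std::string yieldTypeOf(const ToyIteratorType& it) {
  return it.poi->computeReturnType;
}
```

In this model, the two `myIter(13)` calls from the example produce different iterator types because they capture different PoI scopes, and each one keeps yielding the same type even if the iterator value is passed to another context before being looped over.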
Circumventing `CallInfo` etc. for invoking `these` overloads
I noticed that we use `CallInfo` and friends as an odd intermediate representation. The resolver knows that it's handling a special case of resolving `these` using a `tag` and optionally a `followThis` argument; the resolution queries need to specially handle calls to `these` when they are made, so that they can provide compiler-backed implementations for iterator types. However, to get from the resolver to the resolution queries, prior to this PR, we went through `resolveGeneratedCall`, which required the construction of actual types etc. Then, resolution queries had to unpack the `CallInfo`, inspect the formals, and call in to the special case.
This PR circumvents this by introducing a new `IteratorKind` enum, which encodes the kind of `these` overload we're looking for (serial, standalone, leader, etc.). A new query in `resolution-queries` is then defined that handles the special case right away, before constructing a `CallInfo`. Thus, special overloads (like calls to `these` on a loop expression) are handled without the intermediate packing and unpacking of actual types into a `CallInfo` object. This considerably simplifies the resolution process in my opinion.
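The shape of that shortcut can be sketched in a few lines of C++. This is an assumption-laden toy: `ToyIteratorKind`, `ToyOverloads`, and `resolveTheseOverload` are invented for illustration, and the real query takes considerably more context than a map.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical enum mirroring the idea of the new IteratorKind.
enum class ToyIteratorKind { Serial, Standalone, Leader, Follower };

// Overloads known for one iterator, keyed by kind. The value is the yield
// type, reduced to a string for this sketch.
using ToyOverloads = std::map<ToyIteratorKind, std::string>;

// Look up a `these` overload directly by kind. There is no actuals list to
// build and no `tag` formal to match against. Returns nullptr when the
// iterator has no overload of the requested kind.
const std::string* resolveTheseOverload(const ToyOverloads& overloads,
                                        ToyIteratorKind kind) {
  auto it = overloads.find(kind);
  return it == overloads.end() ? nullptr : &it->second;
}
```

The point of the design is that the caller already knows which kind it wants, so passing that as an enum avoids encoding it as a `param tag` actual only to decode it again on the other side.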
Issuing Errors for Incompatible Iterator Types
When a call to an iterator is made, the resolver immediately (and eagerly) searches for all other overloads (leader, follower, standalone, etc.). This is possible because the types of these overloads are determined when the iterator is created, not when it is used. Resolving eagerly also lets us report incompatible yield types. As discussed with @bradcray et al., the serial, standalone, and follower iterators must now all yield the same type.
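A minimal C++ sketch of such a compatibility check follows. The names (`ToyIteratorKind`, `kindName`, `checkYieldTypes`) and the error wording are hypothetical. The sketch also assumes that the leader is exempt from the comparison, since a leader yields chunks of work for followers rather than elements; that exemption is my reading, not something the resolver code is quoted as doing here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical enum; same idea as the IteratorKind described above.
enum class ToyIteratorKind { Serial, Standalone, Leader, Follower };

const char* kindName(ToyIteratorKind k) {
  switch (k) {
    case ToyIteratorKind::Serial: return "serial";
    case ToyIteratorKind::Standalone: return "standalone";
    case ToyIteratorKind::Leader: return "leader";
    case ToyIteratorKind::Follower: return "follower";
  }
  return "unknown";
}

// Compare every element-yielding overload against the serial one and collect
// a message per mismatch. Yield types are plain strings in this sketch.
std::vector<std::string> checkYieldTypes(
    const std::map<ToyIteratorKind, std::string>& yieldTypes) {
  std::vector<std::string> errors;
  auto serial = yieldTypes.find(ToyIteratorKind::Serial);
  if (serial == yieldTypes.end()) return errors;  // nothing to compare against
  for (const auto& entry : yieldTypes) {
    // Assumption: leaders yield work chunks, so they are not compared.
    if (entry.first == ToyIteratorKind::Leader) continue;
    if (entry.second != serial->second) {
      errors.push_back(std::string(kindName(entry.first)) +
                       " overload yields " + entry.second +
                       " but the serial overload yields " + serial->second);
    }
  }
  return errors;
}
```

Because the whole group is built as soon as the iterator call is resolved, a check like this can run once per iterator rather than once per loop that uses it.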
Testing
- [x] dyno tests, including new ones of the programs discussed in this PR OP