Foreach processing for types implementing IEnumerable<T> is problematic
§13.9.5 The foreach statement of the current draft-v9 says that when processing foreach, and after not finding a GetEnumerable method (in practice, this happens with explicit interface implementation), we do the following:
If among all the types
Tᵢfor which there is an implicit conversion fromXtoIEnumerable<Tᵢ>, there is a unique typeTsuch thatTis notdynamicand for all the otherTᵢthere is an implicit conversion fromIEnumerable<T>toIEnumerable<Tᵢ>, then the collection type is the interfaceIEnumerable<T>, the enumerator type is the interfaceIEnumerator<T>, and the iteration type isT.
I think relying on implicit conversions here is problematic. Also, it's not what Roslyn actually does, resulting in differences between the specification and the implementations.
In particular:
-
The specification allows implementing
IEnumerable<T>multiple times and directly using such type inforeach, when one of the type parameters inherits from the other:foreach (var x in new MyCollection()) { } class MyCollection : IEnumerable<string>, IEnumerable<object> { IEnumerator IEnumerable.GetEnumerator() => throw new NotImplementedException(); IEnumerator<string> IEnumerable<string>.GetEnumerator() => throw new NotImplementedException(); IEnumerator<object> IEnumerable<object>.GetEnumerator() => throw new NotImplementedException(); }According to the spec, the unique type
There should bestring, since there is an implicit covariant conversion fromIEnumerable<string>toIEnumerable<object>. The compiler does not allow this code:error CS1640:: foreach statement cannot operate on variables of type 'MyCollection' because it implements multiple instantiations of 'IEnumerable<T>'; try casting to a specific interface instantiation
-
The specification results in the wrong iteration type for a type that implements
IEnumerable<dynamic>:foreach (var x in new MyCollection<dynamic>()) { } class MyCollection<T> : IEnumerable<T> { IEnumerator<T> IEnumerable<T>.GetEnumerator() => throw new NotImplementedException(); IEnumerator IEnumerable.GetEnumerator() => throw new NotImplementedException(); }According to the spec, the unique type
There should beobject, since the type is implicitly convertible to bothIEnumerable<object>andIEnumerable<dynamic>, butdynamicis explicitly forbidden. The compiler determines the iteration type to bedynamic, as expected.
It seems to me that this could be resolved by following the compiler, and specifying this in terms of the interfaces actually implemented by the type in question, not based on implicit conversions. But I don't know whether that's viable.
The specific commit that is linked as "current for v9" has not appeared in the v9 branch, but this would be relevant work for v8 which is still in draft.
For your first point, I'm curious how much interest there would be in making this a compiler bug. I can see the rationale in allowing iteration here under the assumption that there is really no confusion about "which sequence" you're getting. The more-derived sequence should be the same instances as the less-derived. If there is any confusion, that's poor type authoring.
This text is word for word the same as the v5 spec, but the compiler implements the behavior that matches the v4 spec. Like, the spec was changed in v5 here, but the compiler seems to have ignored this change. v4 text:
- If there is exactly one type
Tsuch that there is an implicit conversion fromXto the interfaceSystem.Collections.Generic.IEnumerable<T>, then the collection type is this interface, the enumerator type is the interfaceSystem.Collections.Generic.IEnumerator<T>, and the element type isT.
For your second point, it certainly seems the compiler's behavior is the desirable one. It's not actually possible to implement IEnumerable<dynamic> directly, so this would always be a case involving substituted type parameters.
This raises further questions for me. Let's say we fix dynamic, but then we still have the issue of being convertible to both IEnumerable<(int A, int B)> and IEnumerable<(int, int)> simultaneously. Or IEnumerable<string> and IEnumerable<string?>. Or all the ways that this combines with nesting:
-
IEnumerable<((int A, dynamic B), int C)> -
IEnumerable<((int A, object B), int C)> -
IEnumerable<((int A, dynamic? B), int C)> -
IEnumerable<((int A, object? B), int C)> -
IEnumerable<((int, dynamic), int)> -
IEnumerable<((int, object), int)> -
IEnumerable<((int, dynamic?), int)> -
IEnumerable<((int, object?), int)> - and many more permutations I've left out!
It's an interesting suggestion to talk about interfaces implemented by the type instead of conversions. I'm curious what others think who are more familiar with what this change would entail. If I'm following, the benefit of this change would come from saying that IEnumerable<object> would not be considered to be implemented by the type in your example, but only IEnumerable<dynamic>.
I raised https://github.com/dotnet/roslyn/issues/76598 making the case for following the v5 (to present) spec on point 1.
Response from the compiler is that it's been a decade with next to no customer complaints, so perhaps the best course of action is to change the spec, as @svick originally suggested.
The quoted text from draft-v9 is the same in v7 and draft-v8 so this is a current issue.
This is not the most elegant of algorithms, partly of course due to maintaining backward compatibility when generics were introduced, and contains a surprise or more…
For example: if a type implements both IEnumerable and IEnumerable<T> then which enumerator is selected depends on how IEnumerable's GetEnumerator is implemented:
-
if it is implicitly implemented, i.e.
IEnumerator GetEnumerator()…, then it takes priority overIEunmerable<T>'sGetEnumerator; however -
if it is explicitly implemented, i.e.
IEnumerator IEnumerator.GetEnumerator()…, then it does not.
For both implementers and consumers of such a type that might cause surprise… A Note might not go amiss.
Looking at the second issue raised, I think this is a simple spec error. The bullet point starting “If among all the types Ti…” breaks if there is only one IEnumerable<T>. If there is only one there is no need to figure out which to choose!
Suggestion:
- Otherwise, check for an enumerable interface:
- If there exists a unique
Tfor which there is an implicit conversion fromXtoIEnumerable<T>, then the collection type is the interfaceIEnumerable<T>, the enumerator type is the interfaceIEnumerator<T>, and the iteration type isT.- Otherwise, if among all the types
Tᵢfor which there is an implicit conversion fromXtoIEnumerable<Tᵢ>, there is a unique typeTsuch thatTis notdynamicand for all the otherTᵢthere is an implicit conversion fromIEnumerable<T>toIEnumerable<Tᵢ>, then the collection type is the interfaceIEnumerable<T>, the enumerator type is the interfaceIEnumerator<T>, and the iteration type isT.- As before…
At first glance, Nigel's proposal sounds reasonable. I suspect there may be more subtlety that I'm missing though. Looking forward to the discussion.
I believe the GetEnumerator cases are handled in depth by the previous bullet point ("Otherwise, determine whether the type X has an appropriate GetEnumerator method:") before reaching the part of the spec in question ("Otherwise, check for an enumerable interface:"). So we do need to deal with the first point, after already having ruled out an implicitly implemented GetEnumerator.
If I run svick's example through Nigel's suggestion, I believe it is specified to allow foreaching as string, whereas the compiler forbids foreaching. This is because the compiler does not implement the "and for all the other Tᵢ there is an implicit conversion from IEnumerable<T> to IEnumerable<Tᵢ>" bit.
The "extra subtlety" (@jskeet) is I skipped point 1 on purpose, but it does need to be looked at (@jnm2). We need to discuss how to approach this particular probable ECB ;-) before considering any actual changes.
In discussion in the meeting:
- generic interface covariance and contravariance may have impact here.
- The
dynamictype may impact the behavior as well.
It seems to me that this could be resolved by following the compiler, and specifying this in terms of the interfaces actually implemented by the type in question, not based on implicit conversions. But I don't know whether that's viable.
@svick I don't think so. Here's a type that actually implements both IEnumerable<int> and IEnumerable<dynamic>: Derived<dynamic>
class Base : IEnumerable<int>
{
IEnumerator<int> IEnumerable<int>.GetEnumerator() => throw new System.NotImplementedException();
IEnumerator IEnumerable.GetEnumerator() => throw new System.NotImplementedException();
}
class Derived<T> : Base, IEnumerable<T>
{
IEnumerator<T> IEnumerable<T>.GetEnumerator() => throw new System.NotImplementedException();
}
Let's look at what Roslyn does (possibly via testing rather than code inspection) and go from there. We suspect that instead of considering implicit conversions, it's walking the type hierarchy looking for implementations, and aim to confirm that suspicion. (And we agree with Nigel's suggestion of a note for the explicit/implicit implementation aspect.)
Assigned to @gafter to work out next steps.
I recommend we don't touch this text in v8. If we need to adjust it in v9, let's revisit it at that time.
FYI, the relevant compiler code is in https://github.com/dotnet/roslyn/blob/main/src/Compilers/CSharp/Portable/Binder/ForEachLoopBinder.cs
Punting for now - we do want to fix it, but it's not worse than before, and we don't think it should block C# 8.