csharpstandard icon indicating copy to clipboard operation
csharpstandard copied to clipboard

Foreach processing for types implementing IEnumerable<T> is problematic

Open svick opened this issue 1 year ago • 13 comments

§13.9.5 The foreach statement of the current draft-v9 says that when processing foreach, and after not finding a GetEnumerable method (in practice, this happens with explicit interface implementation), we do the following:

If among all the types Tᵢ for which there is an implicit conversion from X to IEnumerable<Tᵢ>, there is a unique type T such that T is not dynamic and for all the other Tᵢ there is an implicit conversion from IEnumerable<T> to IEnumerable<Tᵢ>, then the collection type is the interface IEnumerable<T>, the enumerator type is the interface IEnumerator<T>, and the iteration type is T.

I think relying on implicit conversions here is problematic. Also, it's not what Roslyn actually does, resulting in differences between the specification and the implementations.

In particular:

  1. The specification allows implementing IEnumerable<T> multiple times and directly using such type in foreach, when one of the type parameters inherits from the other:

    foreach (var x in new MyCollection())
    {
    }
    
    class MyCollection : IEnumerable<string>, IEnumerable<object>
    {
        IEnumerator IEnumerable.GetEnumerator() => throw new NotImplementedException();
    
        IEnumerator<string> IEnumerable<string>.GetEnumerator() => throw new NotImplementedException();
    
        IEnumerator<object> IEnumerable<object>.GetEnumerator() => throw new NotImplementedException();
    }
    

    According to the spec, the unique type T here should be string, since there is an implicit covariant conversion from IEnumerable<string> to IEnumerable<object>. The compiler does not allow this code:

    error CS1640:: foreach statement cannot operate on variables of type 'MyCollection' because it implements multiple instantiations of 'IEnumerable<T>'; try casting to a specific interface instantiation

  2. The specification results in the wrong iteration type for a type that implements IEnumerable<dynamic>:

     foreach (var x in new MyCollection<dynamic>())
     {
     }
    
     class MyCollection<T> : IEnumerable<T>
     {
         IEnumerator<T> IEnumerable<T>.GetEnumerator() => throw new NotImplementedException();
    
         IEnumerator IEnumerable.GetEnumerator() => throw new NotImplementedException();
     }
    

    According to the spec, the unique type T here should be object, since the type is implicitly convertible to both IEnumerable<object> and IEnumerable<dynamic>, but dynamic is explicitly forbidden. The compiler determines the iteration type to be dynamic, as expected.

It seems to me that this could be resolved by following the compiler, and specifying this in terms of the interfaces actually implemented by the type in question, not based on implicit conversions. But I don't know whether that's viable.

svick avatar Oct 16 '24 13:10 svick

The specific commit that is linked as "current for v9" has not appeared in the v9 branch, but this would be relevant work for v8 which is still in draft.

For your first point, I'm curious how much interest there would be in making this a compiler bug. I can see the rationale in allowing iteration here under the assumption that there is really no confusion about "which sequence" you're getting. The more-derived sequence should be the same instances as the less-derived. If there is any confusion, that's poor type authoring.

This text is word for word the same as the v5 spec, but the compiler implements the behavior that matches the v4 spec. Like, the spec was changed in v5 here, but the compiler seems to have ignored this change. v4 text:

  • If there is exactly one type T such that there is an implicit conversion from X to the interface System.Collections.Generic.IEnumerable<T>, then the collection type is this interface, the enumerator type is the interface System.Collections.Generic.IEnumerator<T>, and the element type is T.

For your second point, it certainly seems the compiler's behavior is the desirable one. It's not actually possible to implement IEnumerable<dynamic> directly, so this would always be a case involving substituted type parameters.

This raises further questions for me. Let's say we fix dynamic, but then we still have the issue of being convertible to both IEnumerable<(int A, int B)> and IEnumerable<(int, int)> simultaneously. Or IEnumerable<string> and IEnumerable<string?>. Or all the ways that this combines with nesting:

  • IEnumerable<((int A, dynamic B), int C)>
  • IEnumerable<((int A, object B), int C)>
  • IEnumerable<((int A, dynamic? B), int C)>
  • IEnumerable<((int A, object? B), int C)>
  • IEnumerable<((int, dynamic), int)>
  • IEnumerable<((int, object), int)>
  • IEnumerable<((int, dynamic?), int)>
  • IEnumerable<((int, object?), int)>
  • and many more permutations I've left out!

It's an interesting suggestion to talk about interfaces implemented by the type instead of conversions. I'm curious what others think who are more familiar with what this change would entail. If I'm following, the benefit of this change would come from saying that IEnumerable<object> would not be considered to be implemented by the type in your example, but only IEnumerable<dynamic>.

jnm2 avatar Jan 02 '25 19:01 jnm2

I raised https://github.com/dotnet/roslyn/issues/76598 making the case for following the v5 (to present) spec on point 1.

jnm2 avatar Jan 02 '25 20:01 jnm2

Response from the compiler is that it's been a decade with next to no customer complaints, so perhaps the best course of action is to change the spec, as @svick originally suggested.

jnm2 avatar Jan 07 '25 00:01 jnm2

The quoted text from draft-v9 is the same in v7 and draft-v8 so this is a current issue.

This is not the most elegant of algorithms, partly of course due to maintaining backward compatibility when generics were introduced, and contains a surprise or more…

For example: if a type implements both IEnumerable and IEnumerable<T> then which enumerator is selected depends on how IEnumerable's GetEnumerator is implemented:

  • if it is implicitly implemented, i.e. IEnumerator GetEnumerator()…, then it takes priority over IEunmerable<T>'s GetEnumerator; however

  • if it is explicitly implemented, i.e. IEnumerator IEnumerator.GetEnumerator()…, then it does not.

For both implementers and consumers of such a type that might cause surprise… A Note might not go amiss.

Looking at the second issue raised, I think this is a simple spec error. The bullet point starting “If among all the types Ti…” breaks if there is only one IEnumerable<T>. If there is only one there is no need to figure out which to choose!

Suggestion:

  • Otherwise, check for an enumerable interface:
    • If there exists a unique T for which there is an implicit conversion from X to IEnumerable<T>, then the collection type is the interface IEnumerable<T>, the enumerator type is the interface IEnumerator<T>, and the iteration type is T.
    • Otherwise, if among all the types Tᵢ for which there is an implicit conversion from X to IEnumerable<Tᵢ>, there is a unique type T such that T is not dynamic and for all the other Tᵢ there is an implicit conversion from IEnumerable<T> to IEnumerable<Tᵢ>, then the collection type is the interface IEnumerable<T>, the enumerator type is the interface IEnumerator<T>, and the iteration type is T.
    • As before…

Nigel-Ecma avatar Jan 21 '25 02:01 Nigel-Ecma

At first glance, Nigel's proposal sounds reasonable. I suspect there may be more subtlety that I'm missing though. Looking forward to the discussion.

jskeet avatar Jan 21 '25 09:01 jskeet

I believe the GetEnumerator cases are handled in depth by the previous bullet point ("Otherwise, determine whether the type X has an appropriate GetEnumerator method:") before reaching the part of the spec in question ("Otherwise, check for an enumerable interface:"). So we do need to deal with the first point, after already having ruled out an implicitly implemented GetEnumerator.

jnm2 avatar Jan 21 '25 16:01 jnm2

If I run svick's example through Nigel's suggestion, I believe it is specified to allow foreaching as string, whereas the compiler forbids foreaching. This is because the compiler does not implement the "and for all the other Tᵢ there is an implicit conversion from IEnumerable<T> to IEnumerable<Tᵢ>" bit.

jnm2 avatar Jan 21 '25 17:01 jnm2

The "extra subtlety" (@jskeet) is I skipped point 1 on purpose, but it does need to be looked at (@jnm2). We need to discuss how to approach this particular probable ECB ;-) before considering any actual changes.

Nigel-Ecma avatar Jan 22 '25 08:01 Nigel-Ecma

In discussion in the meeting:

  • generic interface covariance and contravariance may have impact here.
  • The dynamic type may impact the behavior as well.

BillWagner avatar Jan 22 '25 20:01 BillWagner

It seems to me that this could be resolved by following the compiler, and specifying this in terms of the interfaces actually implemented by the type in question, not based on implicit conversions. But I don't know whether that's viable.

@svick I don't think so. Here's a type that actually implements both IEnumerable<int> and IEnumerable<dynamic>: Derived<dynamic>

class Base : IEnumerable<int>
{
    IEnumerator<int> IEnumerable<int>.GetEnumerator() => throw new System.NotImplementedException();

    IEnumerator IEnumerable.GetEnumerator() => throw new System.NotImplementedException();
}

class Derived<T> : Base, IEnumerable<T>
{
    IEnumerator<T> IEnumerable<T>.GetEnumerator() => throw new System.NotImplementedException();
}

jnm2 avatar Jan 22 '25 20:01 jnm2

Let's look at what Roslyn does (possibly via testing rather than code inspection) and go from there. We suspect that instead of considering implicit conversions, it's walking the type hierarchy looking for implementations, and aim to confirm that suspicion. (And we agree with Nigel's suggestion of a note for the explicit/implicit implementation aspect.)

Assigned to @gafter to work out next steps.

jskeet avatar Jan 22 '25 20:01 jskeet

I recommend we don't touch this text in v8. If we need to adjust it in v9, let's revisit it at that time.

FYI, the relevant compiler code is in https://github.com/dotnet/roslyn/blob/main/src/Compilers/CSharp/Portable/Binder/ForEachLoopBinder.cs

gafter avatar Jul 30 '25 00:07 gafter

Punting for now - we do want to fix it, but it's not worse than before, and we don't think it should block C# 8.

jskeet avatar Sep 03 '25 20:09 jskeet