csharplang
csharplang copied to clipboard
[Proposal]: Prefer spans over interfaces in overload resolution
Prefer spans over interfaces in overload resolution
- [x] Proposed
- [ ] Prototype: Not Started
- [ ] Implementation: Not Started
- [ ] Specification: Not Started
Summary
Overload resolution should prefer an overload where the corresponding parameter has a span type over an overload where the parameter has a collection interface type.
Motivation
For APIs that take a collection of items, it may be useful to provide overloads for collection interfaces and for spans. Supporting collection interfaces such as IEnumerable<T> is particularly useful for callers using LINQ; supporting spans such as ReadOnlySpan<T> is useful for callers focused on performance.
However, arrays are implicitly convertible to collection interfaces and spans, so calls with array arguments may be considered ambiguous with the current overload resolution rules.
var ia = ImmutableArray.CreateRange<int>(new[] { 1, 2, 3 }); // error: CreateRange() is ambiguous
public static class ImmutableArray
{
public static ImmutableArray<T> CreateRange<T>(IEnumerable<T> items) { ... }
public static ImmutableArray<T> CreateRange<T>(ReadOnlySpan<T> items) { ... }
}
In cases such as arrays where both overloads are applicable, overload resolution should prefer the span overload.
Detailed design
The overload resolution rules for 11.6.4.6 better conversion target could be updated with an additional rule:
Given two types
T₁andT₂,T₁is a better conversion target thanT₂if one of the following holds:
- ...
T₁isReadOnlySpan<S₁>orSpan<S₁>, andT₂isIEnumerable<S₂>orICollection<S₂>orIList<S₂>orIReadOnlyCollection<S₂>orIReadOnlyList<S₂>, and an implicit conversion exists fromS₂toS₁
Drawbacks
-
Preferring
ReadOnlySpan<T>overIEnumerable<T>in overload resolution requires choosing a user-defined conversion over an implicit reference conversion. -
For generic overloads such as
ImmutableArray.CreateRange<T>()above, an array argument is only ambiguous when the generic type argument is explicit. By contrast, when the generic type argument is inferred, method type inference is required which will ignore conversions and fail to infer the type argument forReadOnlySpan<T>.var ia = ImmutableArray.CreateRange<int>(new[] { 1, 2, 3 }); // C#11: ambiguous, C#12: ReadOnlySpan<int> var ib = ImmutableArray.CreateRange(new[] { 1, 2, 3 }); // C#11/C#12: IEnumerable<int> -
Overloads for collection interfaces and spans are still ambiguous for arrays when compiled with older compilers.
Mixing older compilers and newer TFMs is not strictly a supported scenario but it is used. Perhaps we could add a custom modifier to new overloads, so those overloads are ignored by older C# and VB compilers. What is the behavior from F#?
Alternatives
-
Use distinct method names rather than overloads.
This alternative is not ideal for the BCL which already uses overloads for scenarios such as collection construction where a
ReadOnlySpan<T>overload might be useful. And using distinct names will not work for constructors. -
Use an extension method to overload an instance method. The extension method will only be considered when the instance method is inapplicable which avoids ambiguities when both are applicable.
This alternative prevents the compiler from choosing the better overload, however. And using extension methods will not work for constructors.
Unresolved questions
- Should the updated behavior apply to array arguments only rather than all expressions that are implicitly convertible to both collection interfaces and spans?
Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2023/LDM-2023-06-19.md#prefer-spans-over-interfaces-in-overload-resolution
and an implicit conversion exists from S₂ to S₁
Definitely a bit concerned about this one. Will need to think through this and make sure it doesn't introduce any weirdness in corners.
My personal preference is that we limit this to the case where T is itself known to be an array-type.
Could we simplify this to say that when it comes to the tie breaking phase of overload resolution a ref struct is better than a non-ref struct? As a generality that seems to be what we prefer.
i assume you mean a non (ref struct) as opposed to a (non ref) struct?
I mean the tie breaking rule should prefer ref struct to non (ref-struct). It seems in most cases we'd actually want push towards ref struct
My personal preference is that we limit this to the case where
Tis itself known to be an array-type.
I feel on the fence about this.
- On the one hand, almost every other type would rather implement an interface instead of defining an implicit conversion to one, in which case type hierarchy is already higher precedence than implicit conversions. Arrays are the odd one out here, due to being special in the runtime, so a bespoke tie-breaking rule for array types makes sense.
- On the other hand, if someone defines an (implicit) conversion to {RO}Span on their type, the type author would likely really want to encourage usage through said span. Making that more automatic would be probably be more pit-of-success in most (if not all) cases.
My personal preference is that we limit this to the case where
Tis itself known to be an array-type.
It would be nice if this worked for collection literals as well, not just for arrays.
Found yet another case where this problem pops up when I was trying to push Span<T> through the compiler code base. A common pattern for diagnostics is essentially:
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, params object[] arguments);
To support ReadOnlySpan<char> based arguments for creating diagnostics my first instinct was to do the following:
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, ReadOnlySpan<char> arg);
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, params object[] arguments);
This is not sufficient because then any call site which has a string based argument is now ambiguous. It is applicable to both candidates and there is no tie breaker here.
// Error: this call is ambiguous
AddDiagnostic(diagnostics, errorCode, "example");
The only good solution I can find is using a pair of overloads:
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, ReadOnlySpan<char> arg);
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, params string[] arguments);
void AddDiagnostic(IList<Diagnostic> diagnostics, ErrorCode errorCode, params object[] arguments);
This feels rather unfortunate to have to repeat in many places and it's not very discoverable for developers that aren't familiar with overload resolution quirks. It's also another case where we hit an ambiguity and the correct answer is to pick the ref struct overload. Feel like this is the rule that we need to be leaning into.
@jaredpar based on the LDM previously (for collection expressions), we're on board with ROS<char> being better than object[] by the two-prong test. Both under "span is better than array" and "char is convertible to object". It feels, to me, that we will want to expand those rules out wider to the language as whole, regardless of using a collection-expression or not.
:)
(in other words, i think our baby steps are complimentary with what you want, and we can come up with a fully cohesive set of rules for C#13 here :)).