csharplang
csharplang copied to clipboard
[Proposal]: Let ref structs implement interfaces and substitute into type parameters (VS 17.11, .NET 9)
Ref structs implementing interfaces
- [x] Proposed
- [ ] Prototype: Not Started
- [ ] Implementation: Not Started
- [x] Specification: https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md
Related Issues
- Runtime Feature Tracking
- Related Requests
- https://github.com/dotnet/csharplang/issues/1148
- https://github.com/dotnet/runtime/issues/27229
Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-02-26.md#ref-structs-in-generics https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-06-10.md#ref-structs-implementing-interfaces-and-in-generics https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-07-22.md#ref-structs-implementing-interfaces
@MadsTorgersen @333fred Could we get this on a triage list? It feels like we've spent more hours in LDM working around this problem than we would actually fixing it.
Some comments not directly-related, but interesting to consider:
What about void as a generic parameter? It's more complicated in type system, but there's not safety problem. Variables or fields declared as void can be omitted. Comparing with ref struct, the omitted void can appear in non ref-struct types.
The unconstraint can solve the gap for concrete methods, but how about delegate types? The delegate type parameters can't be constrained, should we specialize for them?
Also, @jaredpar for lifetime issues. My thinking was that every parameter would implicitly carry a "heap-safe-to-escape" lifetime, but maybe there's something I'm not thinking of. Obviously people will request more lifetime stuff in the future to workaround that, but I figured that could be an additional set of language features that we do later.
Can you clarify an example with where T : ~box: how does it affect boxable types like classes? It feels to me like ~box allows ref structs only?
It somehow feels to me that this is not a constraint on a type but rather a contract of a callee: a promise of no-escape-to-the-heap. A lifetime feature rather than a property of a type itself.
Rather than where I think we should consider allow. There are more anti-constraints that have been proposed for the language including items like allowing pointers. Having a new keyword then a ~ for them feels unnatural where as the following feels much more natural to me:
void M<T>()
where T : IDisposable
allow T : ref struct {
}
We don't need much more than the above.
Disagree. As a ref struct can satisfy the where T : ~box proposal. That means that we have to consider all T which satisfy ~box to be potentially ref struct. The proposal doesn't seem to do this but rather focuses on the boxing aspect. That means without additional restrictions code like the following would be allowed:
class C<T> where T : ~box {
T _field;
}
C<Span<int>> // not good
That is also why I prefer allow T : ref struct because it's very clear this is the capability being allowed here and would make it much clearer why the above code fails to compile (also makes it easier for us to generate better error messages).
you just want to prevent boxing.
No. You need to prevent all behaviors that are disallowed by a ref struct. Boxing is only one of them.
No. You need to prevent all behaviors that are disallowed by a ref struct. Boxing is only one of them.
Yeah I agree with that. And I think you're right that we might want to use different syntax. But I think the basic concept is still the same here.
There are a couple of other items.
Essentially any T constrained to allow T : ref struct needs to have all capabilities and restrictions of a ref struct:
- Cannot be a field of a non
ref struct - Declarations can be marked
scoped - Is subject to lifetime tracking rules as a
ref structvalue would be - etc ...
Effectively for all intent and purposes the compiler would treat it like a ref struct value. Cause it potentially can be and that is the worst case from a restriction point of view. No harm is done to other types by being given such restrictions.
Another other item that's come up here is how such an anti-constraint should be applied to existing APIs in the core libraries. Consider for example that every variation of Func / Action should have allow T : ref struct for all their type parameters. In those cases it's a 100% upside change.
One approach to this is to simply go through and manually update every such delegate definition to have this anti-constraint. That's manual and benefits only the developers who put in the work. Another option is to simply embed it in the language rules. Essentially for a type parameter on a delegate definition where it only appears only as the type of a simple parameter (no in/out/ref modifiers) the language will automatically infer allow T : ref struct. Doing so would make the delegate more expressive than if it didn't. I've gone back and forth on this but generally lean towards automatic inferring it here. Could be talked off this point though.
Another item is that allowing ref struct to implement interfaces means we should consider allowing [UnscopedRef] attributes on interface members. Lacking that attribute there are patterns expressible in ref struct today that cannot be expressed when factored into an interface. Think that is relatively straight forward (has meaning when implemented on a struct but not on a class) but still needs to be spec'd out.
The implication of DIM on a ref struct also needs to be worked through. Consider as a concrete example:
interface I1
{
public object M() => this;
}
// Error: implicit box in the implementation of M
ref struct S : I1
{
}
Think the likely conclusion here is that ref struct cannot participate in DIM cause there isn't a reasonable way to prevent all ref safety issues that can come from it. That would require both compiler and runtime support.
Need to work out the rules for co / contra variance
On top of all of those details that @jaredpar brings up, I'd really like to be able to use this anti-constraint with Span<T> so that we can do a better job around dealing with certain InlineArray scenarios, but that api is just messy.
On top of all of those details that @jaredpar brings up, I'd really like to be able to use this anti-constraint with Span<T> so that we can do a better job around dealing with certain InlineArray scenarios, but that api is just messy.
That particular issue is a bit of a thorny problem because there are APIs in Span<T> that are illegal once the anti-constraint is added
ref struct Span<T> : allow T : ref struct {
// Error 1: ref field to ref struct
ref T data;
int length;
// Error 2: Can't use a ref struct as an array element
pulbic Span(T[] array)
}
The error 2 type of problems are very thorny. Essentially there is existing public API surface area that directly violates what the anti-constraint would provide. There isn't a general solution to this that I can envision. Think the way to move forward here would be that the compiler simply ignores them. Essentially Span<T> is a primitive and we special case certain APIs as being illegal to call when we know T is potentially a ref struct. That would classify as hard on my brain but solvable.
The error 1 type of problems are more fundamental. There is a proposal out for allowing ref fields of ref struct but it's a decently complex feature.
I think the notion of anti-constraints needs to be considered very thoroughly. It's almost a full on massive change all on its own. What do ~class, ~struct, ~new, etc. mean? ~notnull becomes weird, because it's a double negative. What about types (classes and interfaces)? I think I saw that brought up before, but I don't remember the discussion and why it was desirable.
The proposed allow keyword is also a little finicky. Does allow only allow (pun intended) ref struct? Are there any potential other scenarios? Why not just add ref struct to the list of where - surely adding a new keyword here is overkill?
I think the notion of anti-constraints needs to be considered very thoroughly
I don't think they're intended to be a general purpose feature. I think they're only to be used where they enable the language to support additional functionality, but they also don't make sense as a normal constraint. A ref struct constraint doesn't force the generic type argument to be a ref struct, it forces the code of the method to follow the rules as if it could be a ref struct, but the argument could be a normal struct (or even a class?)
Put up a full proposal for this feature
https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md
Another alternative that was mentioned in the old issue is placing the ref keyword at the generic declaration site, much like in and out on co/contravariant interfaces.
public void M<ref T>(T o) where T : IDisposable
=> o.Dispose();
My main issue there @timcassell is that reads to me like "this must be a ref struct" as opposed to "this is allowed to be a ref-struct". That said, we'll def consider syntactic possibilities here when designing this out.
What about where T : allow ref struct, ISomeInterface? It's kind of strange to have 2 places with constraints and "expanders" for one generic argument.
The syntax here is the least interesting part :) We'll likely consider a bunch and settle on one the group feels conveys the idea the best and feels the most c#-y :)
@timcassell
Another alternative that was mentioned in the old issue is placing the ref keyword at the generic declaration site, much like in and out on co/contravariant interfaces.
For me that is much too easily confused with supporting the ability to have ref as a type modifier. For example List<ref int>. That is a much different feature (and not one that I think anyone is considering given the fundamental change it would require to the runtime).
What about where T : allow ref struct, ISomeInterface? It's kind of strange to have 2 places with constraints and "expanders" for one generic argument.
I agree it's a bit strange to have both allow and where but unfortunately there need to be up and down modifiers here. That is because a generic parameter today has implicit constraints (can't be pointer, ref struct, etc ...). The syntax must provide a method to remove the implicit constraints (tallow) and add more (where).
As @CyrusNajmabadi mentioned though the syntax is the least interesting part. It is also likely the part we will end up debating the most in LDM. For now I would encourage people to focus on the behaviors and ensure they satisfy the scenarios you want to get out of this feature.
The funny thing that having read the proposal and thought about it, it kind of makes sense to me so the only remaining question that I have is the syntax :) It's just not "what looks nice", but potential future enhancements. I think we should all imagine what allow can allow (pun intended) in the future, if anything; that IMO should be the main factor to decide the syntax.
How will calling methods on System.Object like ToString() and GetHashCode() work? From what I understand, if a struct overrides those methods, they will not be boxed when called. But if they don't override them, they will be boxed. Will calling those methods simply be disallowed? Or that would be allowed in this case since the box is short-lived and doesn't escape?
@TahirAhmadov I thought the same. But I don't really like "allow" syntax. I'd rather think of this not as "allow" but as "restrict the lifetime of values of certain type T".
And I would like to see some work in this direction. E.g. if I didn't want some IDisposable like native resource handle to escape I would use similar mechanism.
This can lay groundwork for more general lifetime and ownership features. I think it's important to think of them when designing syntax/rules.
While the idea to have <ref T> instead of allow came from me, I think ultimately allow is the best and future-consistent approach. In the end we could have allow T : static, allow T : * (pointers), allow T : ref (ref to anything) etc. if needed, so this syntax could prepare the ground for it. Indeed <ref T> does sound like an even stronger anti-constraint (allow all byrefs and byref-likes).
Will this also allow us to use allow pattern outside the ref struct scenarios?
For example,
T ParseValue<T>(string str) where T : IParseable<T> allow T : string
{
if (typeof(T) == typeof(string))
{
return (T)str;
}
return T.Parse(str);
}
@hez2010
Will this also allow us to use
allowpattern outside theref structscenarios?
Not without said features being explicitly designed and implemented.
So I just got news of this proposal. There's a major problem here: "adding instance default interface methods to existing interfaces becomes universally source+binary breaking". We depend on the ability to add default interface methods all over the place.
I'm reading this and going "really?" This breaks the primary motivating feature to have default interface methods.
I did come to an idea of how to fix it but this is probably bonkers hard: when you encounter a missing default interface method on a ref struct; jit the mehod and bail if the jit would emit a box operation.
This breaks the primary motivating feature to have default interface methods.
Default interface methods have always had edge cases where they do not work. For example the diamond problem where the runtime cannot pick the best DIM member results in runtime exceptions. This is just another case where DIM will have an edge case.
when you encounter a missing default interface method on a ref struct; jit the mehod and bail if the jit would emit a box operation.
I suspect that would not be very effective in practice. Most of the DIM members I've seen rely on calling other available members in the type. That implicitly uses this which forces a boxing operation on the value. This would really only work for DIM members that call into static members only and do not use this. Not sure that is a large amount.
"That implicitly uses this which forces a boxing operation on the value.": My testing is that doesn't box a struct. The jitter is smarter than that.
My testing is that doesn't box a struct. The jitter is smarter than that.
The assertion of the runtime team is that it does and that it breaks ref struct semantics.
How does that not break default implementations on normal mutable structs? (As in, the default implementation generates nonfunctional code because it mutates a copy)?
It does break them:
using System;
var s = new S();
Mut(ref s);
// Outputs '0'
Console.WriteLine(s.V);
static void Mut<T>(ref T t) where T : IMutate
{
t.Mutate();
}
interface IMutate
{
int V { get; set; }
void Mutate() { V++; }
}
struct S : IMutate
{
public int V { get; set; }
// Uncomment to print '1' instead
// public void Mutate() { V++; }
}
https://sharplab.io/#v2:C4LgTgrgdgPgAgJgIwFgBQ6BuBDMACAZzwF48oBTAdzwGUAKASgG50BZCYOscgM0OfToA9ELwB5DgAcORAOQAGWejhIAnHQIA6AGoCMaFQDY8cACx52wADwAVAHxdeeG3mAM8lABbluzvCDwASUtsYHJ0AG90PBjXTRCwxhY0AF9BfQBLKDCwHmwAY3IghPC0KLRYvCzgPG08CLwAc3JgJkIWtrSK2LMLDlDyRnragGoRzujY9C70AmBIfJqaf2L+sMjJmLgAZirs2uHm1vbjrsqRPABVKHyAewBbe/J94Fu8STBqvFkkWT258jYAAmmzwFx2JnMJSGDW0YwmqXQQA==
How to solve "Tuple<T1, T2> should be ref struct when either of T1 or T2 is struct"?
It may require some specification work. Will a new RefTuple<T1, T2> type be added to workaround this?