
Extension lowering

Open MadsTorgersen opened this issue 1 year ago • 48 comments

Lowering of extensions

Extensions are "transparent wrappers" that allow types to be augmented with additional members and (eventually) interfaces.

This outlines how we can implement extensions by lowering them to structs, and applying an erasure approach to signatures and generic instantiation.

In this document, base extensions and interfaces are ignored, except for brief consideration at the end.

Extension declarations

An extension declaration like this:

public extension E for U
{
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... this ... }
    public string this[int i] { ... this ... }
}

is lowered to a struct declaration, which contains a private field of the underlying type, as well as the function members from the extension declaration, modified to indirect through the field as necessary:

public struct E
{
    private U __this;
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... __this ... }
    public string this[int i] { ... __this ... }
}

In addition an attribute or other marker may be used to designate that the struct represents an extension.

Extensions in generic instantiations

Extensions used as type arguments or array element types are erased to their underlying type:

List<E>

is lowered to

List<U>

This is the main mechanism ensuring that extensions and their underlying types are interchangeable, even through generic instantiation. In this way, a collection of the underlying type can be freely reinterpreted in terms of an extension, and the elements thus gain the extra members afforded by the extension.

Within member bodies, the compiler can keep track of the fact that a List<U> is "really" a List<E>, and provide appropriate conversions.
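To see why erasure is needed here, a small sketch may help. It assumes a hand-written struct `IntExt` standing in for a lowered extension of `int` (the name and the `GenericDemo` class are hypothetical). Without erasure, the runtime treats the two instantiations as unrelated types:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lowered "extension IntExt for int".
public struct IntExt
{
    private int __this;
}

public static class GenericDemo
{
    // Compares the two closed generic types as the runtime sees them.
    public static bool SameRuntimeType() =>
        typeof(List<IntExt>) == typeof(List<int>);

    public static void Main()
    {
        Console.WriteLine(SameRuntimeType());    // False

        object list = new List<int> { 1, 2, 3 };
        Console.WriteLine(list is List<IntExt>); // False
    }
}
```

Emitting `List<U>` wherever the source says `List<E>` sidesteps this, since only one runtime type is ever involved.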

Extensions in signatures

Extensions are erased from signatures, and are instead communicated through metadata (attributes).

public E M(E e, E[] a, List<E> l) { ... }

is lowered to something like

public U M([Extension(...)] U e, [Extension(...)] U[] a, [Extension(...)] List<U> l) { ... }

The exact encoding scheme for the attributes is TBD, but will probably resemble those for nullable reference types and tuple element names, which are similarly type elements that are erased by the compiler.

This encoding scheme means that methods cannot be overloaded by different extensions for the same underlying type.
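Tuple element names already behave this way today, which gives a rough preview of the consequence. The sketch below uses a hypothetical `ErasureDemo` class and reads the parameter back through reflection: the parameter type is the erased `ValueTuple<int, int>`, and the names survive only in an attribute. The actual extension encoding is still TBD, so this is an analogy rather than the proposed scheme.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class ErasureDemo
{
    public static void Take((int Width, int Height) size) { }

    // Uncommenting this overload does not compile: after erasure both
    // parameters have the same type, ValueTuple<int, int>.
    // public static void Take((int X, int Y) point) { }

    public static ParameterInfo Parameter() =>
        typeof(ErasureDemo).GetMethod(nameof(Take)).GetParameters()[0];

    public static void Main()
    {
        ParameterInfo p = Parameter();
        Console.WriteLine(p.ParameterType == typeof(ValueTuple<int, int>)); // True

        // The element names are carried by an attribute on the parameter.
        var names = p.GetCustomAttribute<TupleElementNamesAttribute>();
        Console.WriteLine(string.Join(", ", names.TransformNames));         // Width, Height
    }
}
```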

Extension member access

In order to provide access to extension members, the compiler is able to freely convert between the extension type and its underlying type.

This conversion relies on the fact that an extension always has exactly the same shape in memory as the underlying type. This makes it safe for the compiler to utilize the Unsafe.As(...) method.

The method

void M(E e)
{
    var i = e.M();
}

is lowered to

void M([Extension(...)] U e)
{
    ref var __e = ref Unsafe.As<U, E>(ref e); // Creates a ref of type E to e
    var i = __e.M();
}
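For a concrete picture, here is a hand-written sketch of roughly what the lowered form could look like, assuming U is `string`. The names `StringExt`, `WordCount` and `Lowered` are hypothetical, and the attribute on the parameter is omitted because its encoding is TBD:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for a lowered "extension StringExt for string".
public struct StringExt
{
    private string __this; // sole field, so the struct has the same layout as the reference

    public int WordCount() =>
        __this.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length;
}

public static class Lowered
{
    // Source form would be: static int Count(StringExt e) => e.WordCount();
    public static int Count(string e)
    {
        // Reinterpret the string reference as the wrapper struct, without copying.
        ref StringExt __e = ref Unsafe.As<string, StringExt>(ref e);
        return __e.WordCount();
    }

    public static void Main()
    {
        Console.WriteLine(Count("lowering of extensions")); // 3
    }
}
```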

Conversions to and from extensions

The bi-directional implicit identity conversion between extensions and their underlying type can likewise be supported through the use of Unsafe.As:

U u = ...;
E e = u; // identity conversion
e.M();

which lowers to

U u = ...;
E e = Unsafe.As<U, E>(ref u);
e.M();

If the underlying type is a value type, the user may need to make e a ref to u, so that the value isn't copied in the conversion, and mutations occurring within extension members apply back to the underlying value:

U u = ...;
ref E e = ref u; // identity conversion
e.M(); // Mutations apply to u

which lowers to

U u = ...;
ref E e = ref Unsafe.As<U, E>(ref u); // identity conversion
e.M(); // Mutations apply to u
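The difference between the copying and the ref form can be sketched with today's C#. This assumes a hand-written struct `Counter` standing in for a lowered extension of `int`; both it and `RefDemo` are hypothetical names:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for a lowered "extension Counter for int".
public struct Counter
{
    private int __this;
    public void Increment() => __this++;
}

public static class RefDemo
{
    public static int IncrementCopy(int u)
    {
        Counter e = Unsafe.As<int, Counter>(ref u); // copies the value out of u
        e.Increment();                               // mutates the copy only
        return u;
    }

    public static int IncrementInPlace(int u)
    {
        ref Counter e = ref Unsafe.As<int, Counter>(ref u); // aliases u
        e.Increment();                                       // mutates u itself
        return u;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementCopy(1));    // 1
        Console.WriteLine(IncrementInPlace(1)); // 2
    }
}
```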

Implicit extensions

Implicit extensions are automatically used as "fallback" extensions for their underlying type in a given static scope. The compiler uses lookup machinery similar to today's extension methods to find where extension members are invoked, and implicitly inserts a conversion to the appropriate extension type. This is described in detail in the Extensions specification.

From there, the lowering proceeds as described above.

Base extensions

It is TBD how base extensions are encoded in the lowered extension declaration. When converting to base extensions, the compiler can use the same approach as between extensions and underlying types, since the underlying representation in memory is unchanged.

Extensions with interfaces

The lowering of an extension declaration which implements interfaces is in and of itself simple: Simply add the interfaces to the lowered struct declaration.

However, semantics become complicated. In particular, the implicit conversions between generic instantiations over extensions vs. underlying types may no longer apply when the extension implements interfaces, and those interfaces are material to satisfying constraints in the generic instantiation itself.
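A sketch of the tension, using a hand-written struct in place of a lowered extension (`IDescribable`, `DescribableInt` and `ConstraintDemo` are hypothetical names): when the interface is needed to satisfy a constraint, the type argument has to be the struct itself, so erasing it to the underlying type is no longer an option.

```csharp
using System;
using System.Runtime.CompilerServices;

public interface IDescribable
{
    string Describe();
}

// Hypothetical stand-in for a lowered "extension DescribableInt for int : IDescribable".
public struct DescribableInt : IDescribable
{
    private int __this;
    public string Describe() => $"int {__this}";
}

public static class ConstraintDemo
{
    // The constraint is only satisfied by the wrapper, never by int.
    public static string DescribeAs<T>(T value) where T : IDescribable =>
        $"{typeof(T).Name}: {value.Describe()}";

    public static void Main()
    {
        int u = 42;
        // DescribeAs(u); // does not compile: int does not implement IDescribable
        Console.WriteLine(DescribeAs(Unsafe.As<int, DescribableInt>(ref u)));
        // DescribableInt: int 42
    }
}
```

Inside `DescribeAs`, `typeof(T)` is the wrapper rather than `int`, which is the observable difference that erasure would otherwise hide.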

A compiler-only approach to this would be to simply not have such conversions when an extension implements an interface, but that means adding an interface to an extension is highly breaking. A more permissive and situational approach would likely require significant new runtime feature work however.

MadsTorgersen avatar Dec 16 '23 00:12 MadsTorgersen

Hey!

Should extension be translated into a ref struct instead?

public extension E for U
{
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... this ... }
    public string this[int i] { ... this ... }
}

equivalent to:

public ref struct E
{
    private ref U __this;
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... __this ... }
    public string this[int i] { ... __this ... }
}

KyouyamaKazusa0805 avatar Dec 16 '23 03:12 KyouyamaKazusa0805

This encoding scheme means that methods cannot be overloaded by different extensions for the same underlying type.

That means you can't do something like this, which is unfortunate:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? Load(TeacherId id)
{
    return dbContext.Teachers.FirstOrDefault(x => x.Id == id);
}

public Person? Load(StudentId id)
{
    return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

Instead you'd have to do this:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? LoadTeacher(TeacherId id)
{
    return dbContext.Teachers.FirstOrDefault(x => x.Id == id);
}

public Person? LoadStudent(StudentId id)
{
    return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

(I'm also assuming public extension X for A in this context is an explicit extension - meaning explicit is implicit, and you have to explicitly state if you want it to be an implicit extension - something I agree with, but can be confusing to developers at first glance).

The thing I think/thought I will be using explicit extensions for the most is for typed values like this, where I want something to be functionally identical to a backing type, but has a different meaning in my domain. Currently I do this by making something like this, but that has a lot of problems - especially with IO (Serialization/Deserialization)

public readonly record struct PersonId(Guid Value)
{
    public override string ToString()
    {
        return Value.ToString();
    }

    // Override `Parse`, `TryParse`, and add some Json serialization attributes, and so on...
}

KennethHoff avatar Dec 16 '23 18:12 KennethHoff

I really wonder how this can be properly done without runtime support. For example:

public void Do<T, U>(Dictionary<T, U> dict) where T : ISomeInterface1 where U : ISomeInterface2
{
    typeof(T/U) // - what is it here? Is it int or a wrapper?
    typeof(T/U).GetMethod("MyMethodFromInterface") // - and how will this work?
    dict[default(T)] = default(U); // - what happens here if Resize is called for example? From runtime point of view Wrapper[] and int[] are different types.
}
// extend int with ISomeInterface1 and ISomeInterface2
Do(new Dictionary<int, int>());

En3Tho avatar Dec 16 '23 18:12 En3Tho

Adding to what @En3Tho and I said yesterday. I consider being unable to overload based on extensions to be a fairly huge problem, so I really hope this will be considered when working out the implementation details.

I think it's time to consider updating the IL metadata. Alongside this issue (extensions) - which I presume would be able to bypass the "According to IL it is the underlying type, so you can't overload based on them" problem with an updated IL representation - there are a few other language features that I believe would benefit from an update to the IL. https://github.com/dotnet/runtime/issues/89730 being the first one that came to mind, but there are a few others that might, like https://github.com/dotnet/runtime/issues/94620 and https://github.com/dotnet/csharplang/issues/5556, and I'm sure many more.

I can't really think of an example of explicit extensions that doesn't fit into the use-case mentioned above¹ (https://github.com/dotnet/csharplang/issues/7771#issuecomment-1858903279), and because of that I think the conceptual model of "TeacherId is a type that coincidentally has the exact same underlying data-structure as a Guid" is more useful than "TeacherId is a Guid with a different compiler-enforced name" (which seemingly is what this lowering strategy will give us). While I'm sure the latter has some use-cases, I consider the former to be much more useful.

Personally - and this might be taking it to the extreme - I would want to be able to disable implicit upcasting of explicit extensions (say that three times fast..), such that the following code would be illegal (I believe it currently is legal):

public extension TeacherId for Guid;

public Person? LoadPerson(Guid id)
{
	return dbContext.Persons.FirstOrDefault(x => x.Id == id);
}

TeacherId teacherId = ...;
LoadPerson(teacherId); // Personally I want this to error.

Guid guid = (Guid)teacherId; // Explicit casting should be allowed though.
LoadPerson(guid);

I realize this can be achieved through Roslyn Analyzers, but I want the language designers to be aware of this use-case.

Additionally, I hope the following is illegal by default, because if it isn't, then this feature is practically worthless for my use-case:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? LoadPerson(Guid id)
{
	return dbContext.Persons.FirstOrDefault(x => x.Id == id);
}

public Person? LoadStudent(StudentId id)
{
	return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

TeacherId teacherId = ...;
LoadStudent(teacherId); // No "sideways casting" allowed.

Guid guid = ...;
LoadStudent(guid); // No downcasting allowed.

¹ I am in no way saying there aren't any, just that I can't think of any.

KennethHoff avatar Dec 17 '23 17:12 KennethHoff

I consider being unable to overload based on extensions to be a fairly huge problem

At that point, just wrap the type manually with a new nominal type. This can also be done easily and cheaply

The point of an extension is that it's new functionality, but the same type (just like an extension method). It's just broadening that to all members.

If you actually want new types, that's the easy thing that is already supported today :-)

CyrusNajmabadi avatar Dec 17 '23 17:12 CyrusNajmabadi

I consider being unable to overload based on extensions to be a fairly huge problem

At that point, just wrap the type manually with a new nominal type. This can also be done easily and cheaply

The point is that that brings a lot of baggage, like requiring custom JsonSerializers, ToString overloads, Value Converters (EF Core), IParseable implementations, and so on..

The point of an extension is that it's new functionality, but the same type (just like an extension method). It's just broadening that to all members.

I agree that implicit extensions offer a lot of new useful functionality in this regard, but explicit extensions seemingly don't (But I'd love to be proven wrong)

KennethHoff avatar Dec 17 '23 17:12 KennethHoff

The point is that that brings a lot of baggage, like requiring custom JsonSerializers, ToString overloads, Value Converters (EF Core), IParseable implementations, and so on..

All that could be done with generators. :-)

If you want distinct types, that's the easy case. The challenge has always been in adding functionality to existing types.

CyrusNajmabadi avatar Dec 17 '23 17:12 CyrusNajmabadi

but explicit extensions seemingly don't (But I'd love to be proven wrong)

It definitely does. It just doesn't allow things like overloading. But so what? Just name the methods differently. :-)

CyrusNajmabadi avatar Dec 17 '23 17:12 CyrusNajmabadi

It definitely does. It just doesn't allow things like overloading. But so what? Just name the methods differently. :-)

Assuming you still need nominal types to do real differentiation, then I fail to see what explicit extensions add to my repertoire that my previous solution and/or implicit extensions don't.

KennethHoff avatar Dec 17 '23 17:12 KennethHoff

As extensions are themselves nominal types, it seems unfortunate to not be able to treat them as nominal types and to do stuff like overloading on them. Not being able to do so I think makes it awkward to use explicit extensions as a mechanism for type-safe aliases. Having to use two distinct names and cast to the explicit extension seems quite redundant:

foo.LoadStudent((StudentId) id);
foo.LoadPerson((PersonId) id);

The above proposal doesn't mention implicit/explicit extensions at all, either because that aspect of extensions remains up in the air or because it's assumed that it would not impact encoding. I'd like to posit the argument that explicit extensions should further blur the lines between aliases and types, and the encoding should bear that out. Otherwise, I don't understand the purpose of having the extension be a nominal type at all, or to allow types to be declared as that extension in signatures or generic type arguments.

HaloFour avatar Dec 17 '23 19:12 HaloFour

This discussion has me convinced that people want two different features and are calling both of them "extensions."

If you want the thing called "type classes", then the concept of being able to "overload" on extension is nonsensical. The definition of a type class is to define an interface or signature and then define how a type conforms to that interface. Giving a name to that particular implementation has some meaning, although now you are defining something closer to ML modules, but defining a new type is nonsensical. An implementation is, by definition, not a type. A type class is a type (or constraint bound). To say that a consumer of a type class can overload on implementation definitions is as nonsensical as saying that a method taking an interface gets to decide which interface implementation gets chosen. The whole point of the abstraction is that the caller has the flexibility, not the callee.

If a new type is defined, then the extensions feature isn't really about type classes at all, it's about subtyping. That is, an extension is really just a new type that is inheriting the components of the base type.

While we could have a single base syntax that allows for both features, I'm not sure that's a good idea.

agocke avatar Dec 17 '23 20:12 agocke

I think the term extension makes little sense for the concept that I was describing earlier in this conversation. It's not extending anything, it's just repackaging something else, giving it a new name and pretending like they are not the same thing.

In my domain, TeacherId, StudentId, and Guid should never be used interchangeably. It is, for the purpose of my domain, a coincidence that they have the same underlying implementation.

I don't think anybody would call what I just described as "an extension to Guid", as I'm more likely to change the for Guid part than the extension StudentId part in a refactor.

I don't know what I would call what I want to achieve however¹, but extension ain't it.

Still confused as to what an explicit extension is though - if the use-case I've described is invalid - except for having yet another level of obfuscation; Needing both a using directive and a cast operator to use it.

¹ Type classes? @agocke mentioned that, but my googling gave me the most abstract answer I've ever seen, so I've no idea if that's it or not.

KennethHoff avatar Dec 17 '23 21:12 KennethHoff

@agocke

If these aren't intended to be actual types, it also makes absolutely no sense to be able to define variables or parameters of them either, yet that's exactly how you're expected to use explicit extensions. Given the enormous amount of overlap I don't think it makes any sense to make them two completely different language features either. That feels like making them separate features for the virtue of adhering to some strict textbook definition of "type class" rather than providing solutions to solve problems.

HaloFour avatar Dec 17 '23 21:12 HaloFour

I think the problem is that "explicit extensions" are already a gray area.

I'm not super versed on typeclasses, what little understanding I have of them is through Scala which probably takes some liberties in order to make them make sense in the JVM. In Scala, the given implementation or witness is not a nominal type and thus can't be used as a nominal type. As long as the given implementation is in scope, the members of the typeclass are available on the underlying type directly, and via generics you'd only ever reference the witness implementation through the typeclass itself. This also appears to be the case with Rust traits and Swift protocols, where the implementation is either only named for the sake of scoping, or not named at all, respectively.

Explicit extensions feel like something completely different in that the witness is a nominal type that you can use as a type in local variables and method signatures. If they aren't a part of the type system, I would argue that it doesn't make sense to allow that. If it's unpalatable to make them a part of the type system because they're called "extensions", then I would argue that maybe we should change "implicit extension" to just "extension" and "explicit extension" to "type".

The recent changes to aliasing in C# generated a lot of feedback from people wanting a way to define strict aliases in the language that behaved like types. It feels like extensions gets us something like 95% of the way there, both in terms of syntax and capabilities.

HaloFour avatar Dec 18 '23 17:12 HaloFour

I have to say that I find this confusing: struct E { private U __this; } why wrap then erase? that looks like a "newtype" imo. Is it only to track the type throughout the code? I can imagine the compiler could do that without actually emitting a wrapper type.

This issue is already apparent in the declaration: public extension TeacherId for Guid;. That example is probably best represented by something like type TeacherId = Guid;, because no actual "extension" is being defined. For me extension is like a container for extension members on a particular type aka "extension everything" which would be lowered to simple static methods, and so it wouldn't be useful to be empty.

Aside: even with that definition I'm not sure how the extension name would be used, e.g. how to call an instance extension property where it would be otherwise ambiguous? The fact that it can also implement interfaces is where things get conflated in my mind. I'm looking at impl .. for declarations in Rust, those would never need a name because it just adds a "conversion".

alrz avatar Dec 19 '23 11:12 alrz

that allow types to be augmented with additional members and (eventually) interfaces.

Is the "for" syntax a placeholder? It feels like it wouldn't be too hard to just reuse the existing base class syntax i.e.

public extension MyExt : ThingImExtending

this is also later expandable to interfaces without it looking alien i.e.

public extension MyExt : ThingImExtending, IMyExtraThingImImplementing

which in my opinion looks more familiar than this

public extension MyExt for ThingImExtending : IMyExtraThingImImplementing

as this could be read as "Implement MyExt for ThingImExtending where ThingImExtending implements IMyExtraThingImImplementing" which, while this isn't a concept expressible now or in the near future in the C# language, from the perspective of someone who spends a lot of time in Rust, this is a mistake that could be made from other languages that have the "impl ... for" pattern and will definitely be confusing once C# has a richer type system more akin to these languages.

Obviously I recognise this isn't a work item now, but some agreement on how this could be done in the future is required to ensure the syntax doesn't clash or create confusion with future features.

Perksey avatar Dec 20 '23 21:12 Perksey

@Perksey all syntax is a strawman placeholder for now.

CyrusNajmabadi avatar Dec 20 '23 21:12 CyrusNajmabadi

@Perksey The reason for the for infix is because you can extend extensions, like this:


public extension BaseExt for Guid;
public extension SuperExt for Guid : BaseExt;

Add in interfaces and they're more confusing


public extension BaseExt for Guid : IFirst;
public extension SuperExt for Guid : BaseExt, ISecond;

Add in partial implementation and they're even more confusing.


public extension BaseExt for Guid;
public partial extension SuperExt for Guid : BaseExt, ISecond;
public partial extension SuperExt for Guid : IThird; // I believe you're allowed to omit the base-class in partial declarations.

How would this third case work without the for keyword?

public extension BaseExt : Guid;
public partial extension SuperExt : Guid, BaseExt, ISecond; // Is this implementing an interface `BaseExt`?
public partial extension SuperExt : IThird; // Is this extending `IThird` all of a sudden? I thought it was extending `Guid`?

.. but it is as @CyrusNajmabadi said; Syntax is secondary, but this is the reason for the current choice, and for is an existing keyword (so no breaking change) and it "coincidentally" also sounds nice.

KennethHoff avatar Dec 20 '23 21:12 KennethHoff

public extension SuperExt for Guid : BaseExt;

Surely the SuperExt will be for the type of the BaseExt? I don't see a world in which you can extend an extension for a type for which the original extension isn't extending.

How would this third case work without the for keyword?

Likely the same way that partial works today, provided my above comment rings true.

Perksey avatar Dec 20 '23 21:12 Perksey

Surely the SuperExt will be for the type of the BaseExt

Definitely not:

Consider something as simple as:

extension ObjectExtensions for object { ... }

Then you do:

extension StringExtensions for string : ObjectExtensions

I would commonly expect this sort of extension hierarchy to occur.

CyrusNajmabadi avatar Dec 20 '23 21:12 CyrusNajmabadi

Ahhhhhhh damn, foiled by OOP again!

Perksey avatar Dec 20 '23 21:12 Perksey

Perhaps

public extension<string> StringExtensions : ObjectExtensions

as this reads as "public extension of string extending the ObjectExtensions extension". This is noisier, but is also akin to the existing type declaration syntax.

The only downside with this one is that it looks extremely strange with generics in the loop.

The main reason I'm interested in the syntax is that I have absolutely no doubt that in its current form, most of the discussions around this proposal will be around the intricacies of how it's lowered and expressed whereas I only care about how it looks in my code and how I use it, in its current form with its current constraints I don't see a world in which I'll be surprised with whatever functional decisions are made. I've been surprised before, but this is why I'm asking about the syntax (which is sort of like the final product) rather than the lowering itself (which are sort of the steps along the way)

Perksey avatar Dec 20 '23 21:12 Perksey

@Perksey https://github.com/dotnet/csharplang/issues/7771#issuecomment-1865143979

whereas I only care about how it looks in my code and how I use it

Hammering out syntax will come after we're happy with semantics.

rather than the lowering itself

That's literally this discussion though :)

CyrusNajmabadi avatar Dec 20 '23 21:12 CyrusNajmabadi

@Perksey

most of the discussions around this proposal will be around the intricacies of how it's lowered

That is what this thread is about after all. Syntax-related discussions feel like they should be in https://github.com/dotnet/csharplang/issues/5497 (or maybe a separate issue? Discussion?)

KennethHoff avatar Dec 20 '23 21:12 KennethHoff

Ah missed that one (just went straight to the latest issue), thanks.

Perksey avatar Dec 20 '23 21:12 Perksey

@HaloFour You may not be very well versed in type classes but you summarized fairly well.

Let me give a quick run down of what problem they're meant to solve, and how.

Type classes were created to help with "ad-hoc polymorphism". Ad-hoc polymorphism was defined as follows by Wadler[1]:

Ad-hoc polymorphism occurs when a function is defined over several different types, acting in a different way for each type. A typical example is overloaded multiplication: the same symbol may be used to denote multiplication of integers (as in 3*3) and multiplication of floating point values (as in 3.14*3.14)

One historical problem with ad-hoc polymorphism solutions is that they don't integrate well with parametric polymorphism (generics). You can see this in pre-generic-math C#, where multiplication couldn't be used on generic type parameters because multiplication was not defined for anything that could be used as a constraint. Even now, if you were to try to define a new operation for Int32, you couldn't do it because that functionality would have to be added to the definition of Int32.

Existing ad-hoc polymorphism solutions also don't interact well with generic types. Let's say you wanted to implement deep-equals functionality for C#. You would start by defining an interface

interface IDeepEquatable<in T> {
    bool DeepEquals(T t);
}

Now let's say you want to define it for List<int>. First you'll need to implement IDeepEquatable for int. Let's say you can get that into the framework. That still leaves List<T>. As it is, you can't implement IDeepEquatable without knowing that T implements IDeepEquatable. But if you add the constraint to T, you can't create Lists of non-IDeepEquatable types, which is not correct.

So you need some mechanism for "conditional implementation." Type classes are one such option. And it appears that wrapping could be one such implementation. The problem is the composite types. If int doesn't implement IDeepEquatable, then you have to wrap it. Same with List -- but now you end up with nested wrappers, e.g.

public struct IntWrap(int value) : IDeepEquatable<int> {
  public bool DeepEquals(int other) => value == other;
}
public struct ListWrap<T>(List<T> value) : IDeepEquatable<List<T>>
  where T : IDeepEquatable<T>
{
   public bool DeepEquals(List<T> other) => ...;
}

So to construct a ListWrap you need a List<IntWrap>, which means you need to copy the list. Even worse, imagine if someone wrote code like

public void M(List<int> list) {
  if (list is null) { ... }
}

Now that always returns false because the wrapper is never null. Repeat for all other pattern checks ad nauseam. The problem is that type classes are meant to describe functions on existing types. They're not meant to introduce new types. Introducing new types introduces new semantics.

[1] Wadler, Blott. How to make ad-hoc polymorphism less ad hoc. 1988

agocke avatar Dec 20 '23 23:12 agocke

Hmmm. What would be the roadblock if we were to make a different turn and use the actual struct type in metadata? For simple scenarios like invocations etc., it just changes where the Unsafe.As is done. For generics, though, if we have:

var list = new List<E>();
var list2 = (List<U>)list; // can this work somehow?

Fundamentally List<U> and List<E> are represented equally in memory anyway. Is this the runtime support "wall" that we would hit in this direction?

TahirAhmadov avatar Dec 21 '23 01:12 TahirAhmadov

@agocke

You may not be very well versed in type classes but you summarized fairly well.

I'll take that as a win. 😁

I can see where generics makes this a problem, and it's probably why type classes in Scala have to adhere to a particular convention of trait. They don't define the behavior of the type, they define the behavior of the witness working with the type. The witness is entirely separate, and needs to be passed as an implicit parameter. Scala 3 hides a lot of that complexity, but I think it ultimately desugars down to the same thing. Either way, neither Int32 nor List<T> can be IDeepEquatable<T>, but a method can use a given implementation of IDeepEquatable<List<int>> in order to determine if two List<int>s have deep equality.

Trying to bridge that with normal interface implementation sounds like a runtime conundrum. Type erasure may simplify the result, but it still feels like a lot of runtime work to really make this work. Trouble is, if you have a method with the signature bool TheyAreDeeplyEquals<T>(List<T> a, List<T> b) where T : IDeepEquatable<T>, where the heck do you pass a witness? Is that where Unsafe.As comes in? You sneak the List<int> in as a List<Int32DeepEquatable> where Int32DeepEquatable is a struct implementation of IDeepEquatable<int>?

Admittedly I'm a bit out of my league and can see where trying to get this to work at all can create warts and friction points elsewhere.

Just, wow: SharpLab

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

namespace TestTypeClasses {

    public interface IDeeplyEquatable<in T> {
        bool DeepEquals(T other);
    }

    public struct Int32DeeplyEquatable : IDeeplyEquatable<Int32DeeplyEquatable> {
        private int _val;
        bool IDeeplyEquatable<Int32DeeplyEquatable>.DeepEquals(Int32DeeplyEquatable other) => _val == other._val;
    }

    public struct ListDeeplyEquatable<T> : IDeeplyEquatable<ListDeeplyEquatable<T>> where T : IDeeplyEquatable<T> {
        private List<T> _val;
        bool IDeeplyEquatable<ListDeeplyEquatable<T>>.DeepEquals(ListDeeplyEquatable<T> other) {
            if (_val == other._val) {
                return true;
            }
            if (_val == null || other._val == null) {
                return false;
            }
            int count = _val.Count;
            if (other._val.Count != count) {
                return false;
            }
            for (int i = 0; i < count; i++) {
                T left = _val[i];
                T right = other._val[i];
                if (left is null) {
                    if (right is not null) {
                        return false;
                    }
                }
                else if (!left.DeepEquals(right)) {
                    return false;
                }
            }
            return true;
        }
    }

    internal class Program {
        static void Main(string[] args) {
            List<int> list1 = [1, 2, 3];
            List<int> list2 = [1, 2, 3];

            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc1 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list1);
            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc2 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list2);

            bool deeplyEquals = AreDeeplyEquals(tc1, tc2);
            Console.WriteLine($"Equals: {deeplyEquals}");
        }

        static bool AreDeeplyEquals<T>(T left, T right) where T : IDeeplyEquatable<T> {
            return left.DeepEquals(right);
        }
    }
}

HaloFour avatar Dec 21 '23 04:12 HaloFour

Hi @HaloFour. If you change list2 to [4, 5, 6], AreDeeplyEquals still says true, but if the two lists have different lengths, it says false. I don't know why, but I think IDeeplyEquatable<ListDeeplyEquatable<T>> works well, but IDeeplyEquatable<Int32DeeplyEquatable> doesn't. On each loop iteration inside IDeeplyEquatable<ListDeeplyEquatable<T>>.DeepEquals, left and right have the same value. sharplab.io

FaustVX avatar Dec 21 '23 18:12 FaustVX

@FaustVX

oops, that's a bug in my implementation, I'll update my comment to reflect the working version:

            for (int i = 0; i < count; i++) {
                T left = _val[i];
                T right = _val[i]; // oops!  should be: T right = other._val[i];
                if (left is null) {
                    if (right is not null) {
                        return false;
                    }
                }
                else if (!left.DeepEquals(right)) {
                    return false;
                }
            }

The approach works, although there are warts. The type of T is ListDeeplyEquatable<Int32DeeplyEquatable> and it's a runtime error to attempt to cast it to List<int>. I understand that this approach also can create problems for GC. I assume that runtime changes are going to be required in order to support this in general, so it's not particularly surprising that we can't cleanly manage this today.

HaloFour avatar Dec 21 '23 21:12 HaloFour