Tuples deconstructing support

Open greenozon opened this issue 5 years ago • 4 comments

ILSpy version 6.0.0.5410-alpha1

Continuing to experiment with newer language features, this time mainly tuple deconstruction: https://docs.microsoft.com/en-us/dotnet/csharp/deconstruct

Input test code:

public class TestDeconstructors
{
    private int a1, a2, a3;

    public void Deconstruct(out int a1, out int a2, out int a3)
    {
        a1 = this.a1;
        a2 = this.a2;
        a3 = this.a3;
    }

    public void Test1(TestDeconstructors other)
    {
        (a1, a2, a3) = other;
    }
}

ILSpy output:

public class TestDeconstructors
{
	private int a1;

	private int a2;

	private int a3;

	public void Deconstruct(out int a1, out int a2, out int a3)
	{
		a1 = this.a1;
		a2 = this.a2;
		a3 = this.a3;
	}

	public void Test1(TestDeconstructors other)
	{
		other.Deconstruct(out int num, out int num2, out int num3);
		a1 = num;
		a2 = num2;
		a3 = num3;
	}
}

greenozon avatar Nov 30 '19 11:11 greenozon

There are two fundamentally different language constructs here:

  1. Deconstruction introducing new local variables:
var (a, b) = expr1;
var ((c, d), (e, f)) = expr2;

These somewhat overlap with pattern matching (#2048), as patterns can also contain this style of deconstruction.

  2. Deconstruction assigning to existing expressions:
(Console.CursorLeft, Console.CursorTop) = expr1;
((Get(0).Prop, Get(1).Prop), (Get(2).Prop, Get(2).Prop)) = expr2;

These are a bit more tricky:

  • first all Get() methods are called
  • then expr2 is evaluated
  • then all Deconstruct() methods are called
  • finally all set_Prop accessors are called
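
To make the ordering above observable, here is a minimal sketch that logs each step. All type and member names (Target, Source, EvalOrderDemo, MakeSource) are hypothetical and made up for this illustration; they are not part of ILSpy. It uses a single, non-nested deconstruction to keep the log short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo types; none of these names come from ILSpy.
public class Target
{
    private readonly int id;
    private string prop = "";

    public Target(int id) { this.id = id; }

    public string Prop
    {
        get => prop;
        // Record when the setter runs, so the order becomes visible.
        set { EvalOrderDemo.Log.Add($"set_Prop({id})"); prop = value; }
    }
}

public class Source
{
    public void Deconstruct(out string x, out string y)
    {
        EvalOrderDemo.Log.Add("Deconstruct");
        x = "x";
        y = "y";
    }
}

public static class EvalOrderDemo
{
    public static readonly List<string> Log = new List<string>();
    private static readonly Target[] targets = { new Target(0), new Target(1) };

    private static Target Get(int i)
    {
        Log.Add($"Get({i})");
        return targets[i];
    }

    private static Source MakeSource()
    {
        Log.Add("MakeSource");
        return new Source();
    }

    // Runs one deconstructing assignment and returns the recorded steps.
    public static List<string> Run()
    {
        Log.Clear();
        (Get(0).Prop, Get(1).Prop) = MakeSource();
        return new List<string>(Log);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The log should list the Get() calls first, then MakeSource, then Deconstruct, then the two setters, which matches the four steps in the list above.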

dgrunwald avatar Jun 26 '20 18:06 dgrunwald

Though I guess there isn't really a difference between var (c, d, e, f) = expr2; and

int c, d, e, f;
(c, d, e, f) = expr2;

So in some sense the expression form is the more general one; it's just that the copying of the outputs can be optimized out if it's just a copy between local variables.
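
As a small sketch of that equivalence (Quad and FormsDemo are hypothetical names invented for this example), both forms below should behave the same from the caller's point of view:

```csharp
using System;

// Hypothetical type with a four-output Deconstruct method.
public class Quad
{
    public void Deconstruct(out int c, out int d, out int e, out int f)
    {
        c = 1;
        d = 2;
        e = 3;
        f = 4;
    }
}

public static class FormsDemo
{
    // Declaration form: introduces new locals.
    public static int WithDeclaration(Quad q)
    {
        var (c, d, e, f) = q;
        return c * 1000 + d * 100 + e * 10 + f;
    }

    // Assignment form: assigns to already declared locals.
    public static int WithAssignment(Quad q)
    {
        int c, d, e, f;
        (c, d, e, f) = q;
        return c * 1000 + d * 100 + e * 10 + f;
    }

    public static void Main()
    {
        var q = new Quad();
        Console.WriteLine(WithDeclaration(q) == WithAssignment(q));
    }
}
```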

dgrunwald avatar Jun 26 '20 19:06 dgrunwald

ILAst representation of deconstruction

Requirements:

  • we should somehow represent the nested structure, to easily tell ((a, b), c) from (a, (b, c))
  • we should preserve the order of evaluation to avoid semantic confusion
  • for nested deconstruction, we might initially transform only a portion of the code pattern; and then later find a larger pattern when we consider an earlier starting point
  • we should allow inlining into the rhs expression and lhs target expressions, so that we can just let the inliner handle these and keep our IL pattern manageable.

Idea:

class DeconstructInstruction {
   InstructionCollection<StLoc> lhsTargetInit; // these slots allow inlining
   RecursiveMatch deconstruct; // TestedOperand is the RHS; the pattern determines the nesting structure
   Block conversions; // block with `stloc new_temp = implicit_conversion(temp_from_deconstruct)`
   Block assignments; // block with the actual assignments `call set_Prop(ldloc lhs_target_var, ldloc new_temp)`
}

Full example:

struct CustomString
{
    public static implicit operator string(CustomString s) => null;
}

class C {
	public string Prop { get; set; }

	public C Get(int i) => null;

	public void Deconstruct(out string a1, out CustomString a2)
	{
		a1 = "a";
		a2 = new CustomString();
	}

	public (C, C) GetTuple() => throw null;

	public void Test()
	{
		((Get(0).Prop, Get(1).Prop), (Get(2).Prop, Get(2).Prop)) = GetTuple();
	}
}

The deconstruction in Test would be represented with this ILAst (which borrows some node types from #2048):

Deconstruction {
init:
    stloc lhs1(call Get(ldloc this, ldc.i4 0))
    stloc lhs2(call Get(ldloc this, ldc.i4 1))
    stloc lhs3(call Get(ldloc this, ldc.i4 2))
    stloc lhs4(call Get(ldloc this, ldc.i4 2))
deconstruct:
    match.recursive(tmp = call GetTuple()) {
        match.recursive.deconstruct(tmp1 = tmp.Item1) {
            match.var(d1 = deconstruct.result0(tmp1)),
            match.var(d2 = deconstruct.result1(tmp1))
        }
        match.recursive.deconstruct(tmp2 = tmp.Item2) {
            match.var(d3 = deconstruct.result0(tmp2)),
            match.var(d4 = deconstruct.result1(tmp2))
        }
    }
conversions: Block {
        stloc conv2(call op_Implicit(ldloc d2))
        stloc conv4(call op_Implicit(ldloc d4))
    }
assignments: Block {
        call set_Prop(ldloc lhs1, ldloc d1)
        call set_Prop(ldloc lhs2, ldloc conv2)
        call set_Prop(ldloc lhs3, ldloc d3)
        call set_Prop(ldloc lhs4, ldloc conv4)
    }
}
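
For readers less familiar with ILAst, the four sections above correspond roughly to the following hand-written C#. This is only an illustrative sketch, not ILSpy code: LoweringDemo, NewTargets and Props are hypothetical helpers, the types are simplified so the program actually runs (the conversion returns "converted" instead of null), and four distinct targets are used instead of calling Get(2) twice.

```csharp
using System;

public struct CustomString
{
    // Returns a marker string so the conversion is visible in the result.
    public static implicit operator string(CustomString s) => "converted";
}

public class C
{
    public string Prop { get; set; } = "";

    public void Deconstruct(out string a1, out CustomString a2)
    {
        a1 = "a";
        a2 = new CustomString();
    }
}

// Hypothetical helper class for this sketch.
public static class LoweringDemo
{
    private static C[] NewTargets() => new[] { new C(), new C(), new C(), new C() };
    private static (C, C) GetTuple() => (new C(), new C());
    private static string[] Props(C[] t) => Array.ConvertAll(t, x => x.Prop);

    // The original nested deconstructing assignment.
    public static string[] ViaDeconstruction()
    {
        C[] t = NewTargets();
        ((t[0].Prop, t[1].Prop), (t[2].Prop, t[3].Prop)) = GetTuple();
        return Props(t);
    }

    // The same effect written out phase by phase.
    public static string[] ViaManualPhases()
    {
        C[] t = NewTargets();

        // init: evaluate the assignment targets first
        C lhs1 = t[0], lhs2 = t[1], lhs3 = t[2], lhs4 = t[3];

        // deconstruct: evaluate the right-hand side, then call Deconstruct
        (C, C) tmp = GetTuple();
        tmp.Item1.Deconstruct(out string d1, out CustomString d2);
        tmp.Item2.Deconstruct(out string d3, out CustomString d4);

        // conversions: apply the implicit conversions where needed
        string conv2 = d2;
        string conv4 = d4;

        // assignments: finally call the setters, in order
        lhs1.Prop = d1;
        lhs2.Prop = conv2;
        lhs3.Prop = d3;
        lhs4.Prop = conv4;
        return Props(t);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ViaDeconstruction()));
        Console.WriteLine(string.Join(",", ViaManualPhases()));
    }
}
```

Both methods should produce the same property values, which is the property a decompiler transform from the manual form back to the deconstruction syntax would rely on.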

Invariants:

  • all init stores must be single-definition, single-use
  • the init variables are used only as assignment targets
  • the init uses are in the same order as the init stores
  • deconstruct is a recursive pattern that always succeeds
  • deconstruct does not perform any type or null-checks (=potentially throws NRE on Deconstruct calls)
  • match.recursive variables are only used within the pattern (=no designators)
  • match.var variables are used exactly once, either within conversions or assignments
  • conversions contains stloc c(conv(ldloc d)), where c is a new single-definition single-use variable; and d is from a match.var within deconstruct
  • conversions are in the same order as the corresponding match.var
  • assignments can be:
    • property/indexer calls: all arguments except the last are from the init temporaries
    • stloc: assignment to a local variable
    • stobj: address is loaded from an init temporary
      • such init temporaries must be checked with StObj.IsValidTarget
    • stfld: target is loaded from an init temporary
    • stelem: target+indices are loaded from init temporaries
      • (note that while stfld/stelem internally use stobj, the ldflda/ldelema portion must stay in the assignments block to avoid changing when a NullReferenceException happens)

dgrunwald avatar Jun 26 '20 22:06 dgrunwald

Deconstruction TODO:

  • [x] User-defined Deconstruct methods
  • [x] Tuple deconstruction
  • [ ] Using the return value of the deconstruction (var a = (b, c) = d; -- currently not planned)
  • [ ] Nested deconstruction (currently not planned)
  • [ ] Conversions
    • [x] Numeric conversions
    • [x] Reference conversions
    • [ ] User-defined conversions
    • [ ] ...
  • [x] Discards
  • [x] Deconstruction in foreach
  • [x] var (a, b) = ...;
  • [ ] Assignments to:
    • [x] local variables
    • [x] properties/indexers
    • [x] fields
    • [ ] array elements
    • [x] ref variables

siegfriedpammer avatar Aug 15 '20 18:08 siegfriedpammer