go2cs Optimize lambda variable captures to only those needed

Open ritchiecarroll opened this issue 11 months ago • 1 comments

Currently the performVariableAnalysis method runs ahead of the actual conversion steps to determine which variables are captured in a lambda expression. This is done because Go evaluates the arguments of a go or defer call when the statement executes, effectively making a "copy" of each variable, whereas a C# lambda keeps references to the variables it captures. To make the C# semantics safely match Go behavior, a temporary variable is used to copy the captured variable, then the temporary variable is captured instead. For example:

func sieve() {
    ch := make(chan int)
    go generate(ch)
    for {
        prime := <-ch
        fmt.Print(prime, "\n")
        ch1 := make(chan int)
        go filter(ch, ch1, prime)
        ch = ch1

        if prime > 40 {
            break
        }
    }
}

This gets converted to C# as follows:

private static void sieve() {
    var ch = new channel<nint>(1);
    var chʗ1 = ch;
    goǃ(_ => generate(chʗ1));
    while (ᐧ) {
        nint prime = ᐸꟷ(ch);
        fmt.Print(prime, "\n");
        var ch1 = new channel<nint>(1);
        var chʗ2 = ch;
        var ch1ʗ1 = ch1;
        goǃ(_ => filter(chʗ2, ch1ʗ1, prime));
        ch = ch1;
        if (prime > 40) {
            break;
        }
    }
}

Currently this copy is made every time, for every capture whose data type might need a copy. In the example above this is critical, since "copies" of the variables should be used in the lambdas, not references to the same variables. However, this is not always necessary. Sometimes, depending on the use case, a reference to the variable is just fine.
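The difference between the two behaviors can be seen in Go itself. The sketch below is only an illustration and not part of go2cs; the function name observe is made up for this example. It uses defer rather than go so the result is deterministic: one deferred call receives x as an argument (evaluated at the statement, like the go calls above), the other is a closure that reads x directly, which is how a C# lambda would behave without the temporary copy.

```go
package main

import "fmt"

// observe is a hypothetical helper that contrasts the two capture behaviors.
// It returns the value of x as seen by each deferred call.
func observe() (byArg, byClosure int) {
	x := 1

	// The argument is evaluated when the defer statement executes,
	// so this call receives a copy of x taken while x == 1.
	defer func(v int) { byArg = v }(x)

	// The closure captures the variable x itself, so it observes
	// whatever x holds when the deferred call finally runs.
	defer func() { byClosure = x }()

	x = 2 // modification after both "captures"
	return
}

func main() {
	byArg, byClosure := observe()
	fmt.Println("argument saw:", byArg)
	fmt.Println("closure saw:", byClosure)
}
```

The argument form sees 1 and the closure form sees 2. When x is never reassigned after the capture, both forms observe the same value, which is the case where the temporary copy could be dropped.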

Avoiding these temporary copies wherever possible would make the converted code much simpler and easier to read.

Doing this involves determining which variables need to be copied before being captured in a lambda expression to maintain Go's semantics, and which do not.

In the provided example, the added comments mark which variables should be copied and which should not:

func sieve() {
    ch := make(chan int)
    go generate(ch)      // ch should be copied 
    for {
        prime := <-ch
        fmt.Print(prime, "\n")
        ch1 := make(chan int)
        go filter(ch, ch1, prime)  // ch and ch1 should be copied, but prime does not need to be copied
        ch = ch1   // This modification is why ch needs copying
        if prime > 40 {
            break
        }
    }
}

I believe the core requirements for this task are as follows (although there could be other use cases not considered):

A variable needs to be copied before capture if either of the following holds:

It's used in a lambda expression AND it's modified after being captured AND it's a reference type (channel, slice, map, etc.)
OR it's a loop variable in a lambda context

Variables should NOT be copied if:

They're basic types (int, string, etc.) unless they're loop variables
They're never modified after capture
They're not used in a lambda expression
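As a rough sketch, the rules above could be written as a single predicate. Everything here is hypothetical: captureInfo and needsCopy are names invented for this illustration, not existing go2cs types, and the flags are assumed to have been filled in by an earlier analysis pass.

```go
package main

import "fmt"

// captureInfo is a hypothetical per-variable summary that an analysis
// pass would have to produce; it is not an existing go2cs type.
type captureInfo struct {
	usedInLambda         bool // variable appears inside a lambda expression
	modifiedAfterCapture bool // variable is assigned after the capture point
	isReferenceType      bool // channel, slice, map, etc.
	isLoopVariable       bool // variable is a loop variable
}

// needsCopy applies the rules listed above: a loop variable used in a
// lambda is always copied; otherwise a copy is needed only for a
// reference type that is modified after being captured.
func needsCopy(v captureInfo) bool {
	if !v.usedInLambda {
		return false
	}
	if v.isLoopVariable {
		return true
	}
	return v.modifiedAfterCapture && v.isReferenceType
}

func main() {
	// ch from the sieve example: a channel reassigned after the go statement.
	ch := captureInfo{usedInLambda: true, modifiedAfterCapture: true, isReferenceType: true}
	// prime from the sieve example: an int that is never reassigned.
	prime := captureInfo{usedInLambda: true}

	fmt.Println("ch needs copy:", needsCopy(ch))
	fmt.Println("prime needs copy:", needsCopy(prime))
}
```

Keeping the decision in one place like this would let visitGoStmt and visitDeferStmt share it, though that split is only a suggestion.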

The challenge is to revise the code wherever necessary (e.g., performVariableAnalysis, visitGoStmt, visitDeferStmt) to make this optimization happen.
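One ingredient of that analysis, detecting whether a captured name is assigned again after the go statement, might be sketched with the standard go/ast package. This is not how performVariableAnalysis is actually implemented; modifiedAfterGo and sieveSrc are made-up names, and the check is deliberately simplistic: it matches identifiers by name only, compares source positions, and ignores scopes, shadowing, defer statements, closures, and increment/decrement statements.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sieveSrc is the sieve example from this issue, wrapped in a package clause.
const sieveSrc = `package main

func sieve() {
	ch := make(chan int)
	go generate(ch)
	for {
		prime := <-ch
		ch1 := make(chan int)
		go filter(ch, ch1, prime)
		ch = ch1
		if prime > 40 {
			break
		}
	}
}
`

// modifiedAfterGo (hypothetical) reports, for each plain identifier passed as
// an argument to a go statement, whether an assignment to the same name
// appears later in the source text. Assumption: names are unique per file.
func modifiedAfterGo(src string) (map[string]bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}

	captured := map[string]token.Pos{} // name -> first go statement using it
	var assigns []*ast.AssignStmt

	ast.Inspect(file, func(n ast.Node) bool {
		switch s := n.(type) {
		case *ast.GoStmt:
			for _, arg := range s.Call.Args {
				if id, ok := arg.(*ast.Ident); ok {
					if _, seen := captured[id.Name]; !seen {
						captured[id.Name] = s.Pos()
					}
				}
			}
		case *ast.AssignStmt:
			// ":=" declares a new variable; everything else modifies one.
			if s.Tok != token.DEFINE {
				assigns = append(assigns, s)
			}
		}
		return true
	})

	result := make(map[string]bool, len(captured))
	for name := range captured {
		result[name] = false
	}
	for _, a := range assigns {
		for _, lhs := range a.Lhs {
			id, ok := lhs.(*ast.Ident)
			if !ok {
				continue
			}
			// Positional test only: an earlier assignment inside the same
			// loop would also run after the capture, and is missed here.
			if pos, ok := captured[id.Name]; ok && a.Pos() > pos {
				result[id.Name] = true
			}
		}
	}
	return result, nil
}

func main() {
	res, err := modifiedAfterGo(sieveSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println("ch modified after capture:", res["ch"])
	fmt.Println("prime modified after capture:", res["prime"])
}
```

A real implementation would more likely resolve identifiers to their declarations (for example through type-checker object information) instead of comparing names, so that shadowed variables are not confused with one another.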

Jan 06 '25 02:01 ritchiecarroll
