go2cs
Optimize lambda variable captures to only those needed
Currently, the performVariableAnalysis method runs ahead of the actual conversion steps to determine which variables are captured in a lambda expression. This is needed because Go semantics effectively make a "copy" of a variable passed to a call started by a go or defer statement, whereas a C# lambda keeps a reference to each captured variable. To make the C# semantics safely match Go's behavior, the captured variable is first copied into a temporary variable, and the temporary variable is captured instead. For example:
func sieve() {
    ch := make(chan int)
    go generate(ch)
    for {
        prime := <-ch
        fmt.Print(prime, "\n")
        ch1 := make(chan int)
        go filter(ch, ch1, prime)
        ch = ch1
        if prime > 40 {
            break
        }
    }
}
This is converted to C# as follows:
private static void sieve() {
    var ch = new channel<nint>(1);
    var chʗ1 = ch;
    goǃ(_ => generate(chʗ1));
    while (ᐧ) {
        nint prime = ᐸꟷ(ch);
        fmt.Print(prime, "\n");
        var ch1 = new channel<nint>(1);
        var chʗ2 = ch;
        var ch1ʗ1 = ch1;
        goǃ(_ => filter(chʗ2, ch1ʗ1, prime));
        ch = ch1;
        if (prime > 40) {
            break;
        }
    }
}
Currently, this copy is made every time, for every capture whose data type might need a copy. In the example above this is critical, since the lambdas must use "copies" of the variables, not references to the same variables. However, it is not always necessary: depending on the use case, a reference to the variable is sometimes just fine.
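The difference between the two behaviors can be reproduced in plain Go. The following is a minimal sketch, not code from go2cs, and all function names in it are hypothetical. It uses `defer` so the result is deterministic; arguments of a `go` call are evaluated at the statement in the same way. The closure variant stands in for how a C# lambda would see the variable.

```go
package main

import "fmt"

// evaluatedAtCall passes x as an argument. Arguments of a deferred (or go)
// call are evaluated when the statement executes, so the later write to x
// is not observed.
func evaluatedAtCall() (seen int) {
	x := 1
	defer func(v int) { seen = v }(x) // x is evaluated here, so v == 1
	x = 2
	return
}

// capturedByReference captures x in a closure instead. The closure reads x
// only when it runs, so it observes the later write. This mirrors how a C#
// lambda treats a captured variable.
func capturedByReference() (seen int) {
	x := 1
	defer func() { seen = x }()
	x = 2
	return
}

// copiedThenCaptured applies the temporary-variable technique: copy first,
// then capture the copy. The later write to x no longer matters.
func copiedThenCaptured() (seen int) {
	x := 1
	xCopy := x
	defer func() { seen = xCopy }()
	x = 2
	return
}

func main() {
	// Prints: 1 2 1
	fmt.Println(evaluatedAtCall(), capturedByReference(), copiedThenCaptured())
}
```

When `x` is never written after the capture, all three variants return the same value, which is the case where the temporary copy could be dropped.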
Avoiding unnecessary temporary variables would make the converted code much simpler and easier to read, which would be ideal where possible.
Doing this involves determining which variables need to be copied before being captured in a lambda expression to maintain Go's semantics, and which do not.
In the example provided, the added comments mark which variables should be copied and which should not:
func sieve() {
    ch := make(chan int)
    go generate(ch) // ch should be copied
    for {
        prime := <-ch
        fmt.Print(prime, "\n")
        ch1 := make(chan int)
        go filter(ch, ch1, prime) // ch and ch1 should be copied, but prime does not need to be copied
        ch = ch1 // This modification is why ch needs copying
        if prime > 40 {
            break
        }
    }
}
I believe the core requirements for this task are as follows (although there could be other use cases not considered):
A variable needs to be copied before capture if either of the following holds:
- All three of these are true:
  - It's used in a lambda expression
  - It's modified after being captured
  - It's a reference type (channel, slice, map, etc.)
- It's a loop variable in a lambda context
Variables should NOT be copied if:
- They're basic types (int, string, etc.) unless they're loop variables
- They're never modified after capture
- They're not used in a lambda expression
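Read literally, the rules above amount to a small predicate. The sketch below is one possible encoding, assuming the analysis has already gathered the four facts per variable. The `capture` type, its fields, and `needsCopy` are hypothetical names, not part of go2cs. How "loop variable" is decided is left open here, as it is in the rules.

```go
package main

import "fmt"

// capture holds the facts the rules refer to for one variable.
// Hypothetical type: the real analysis may track this differently.
type capture struct {
	Name                 string
	UsedInLambda         bool
	ModifiedAfterCapture bool
	IsReferenceType      bool // channel, slice, map, etc.
	IsLoopVar            bool
}

// needsCopy encodes the listed rules:
// (used in lambda AND modified after capture AND reference type)
// OR (loop variable in a lambda context).
func needsCopy(c capture) bool {
	if !c.UsedInLambda {
		return false // never captured, so nothing to copy
	}
	if c.IsLoopVar {
		return true
	}
	return c.ModifiedAfterCapture && c.IsReferenceType
}

func main() {
	vars := []capture{
		{Name: "reassignedChannel", UsedInLambda: true, ModifiedAfterCapture: true, IsReferenceType: true},
		{Name: "untouchedInt", UsedInLambda: true},
		{Name: "loopCounter", UsedInLambda: true, IsLoopVar: true},
		{Name: "unusedSlice", ModifiedAfterCapture: true, IsReferenceType: true},
	}
	for _, v := range vars {
		fmt.Printf("%s: copy=%v\n", v.Name, needsCopy(v))
	}
}
```

Keeping the decision in one function like this would let visitGoStmt and visitDeferStmt share it, though that placement is only a suggestion.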
The challenge is to revise the code wherever necessary (e.g., performVariableAnalysis, visitGoStmt, visitDeferStmt) to make this optimization happen.
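As a starting point for the "modified after being captured" condition only, here is a rough sketch using the standard `go/ast` and `go/parser` packages. It is not how performVariableAnalysis works today. The function `modifiedAfterGo` is a hypothetical name, and the check is purely positional: it matches identifiers by name and ignores scopes, shadowing, loop back-edges, defer statements, and the loop-variable rule.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// src is the sieve example from this issue, wrapped in a package clause.
const src = `package p

func sieve() {
	ch := make(chan int)
	go generate(ch)
	for {
		prime := <-ch
		ch1 := make(chan int)
		go filter(ch, ch1, prime)
		ch = ch1
		if prime > 40 {
			break
		}
	}
}
`

// modifiedAfterGo returns the sorted names of identifiers that are passed as
// arguments to a go statement and are assigned with "=" at a later source
// position.
func modifiedAfterGo(source string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", source, 0)
	if err != nil {
		return nil, err
	}

	firstCapture := map[string]token.Pos{} // earliest go statement per name
	var assigns []*ast.AssignStmt

	// ast.Inspect walks depth-first in source order, so the first go
	// statement seen for a name is also the earliest one.
	ast.Inspect(file, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.GoStmt:
			for _, arg := range node.Call.Args {
				if id, ok := arg.(*ast.Ident); ok {
					if _, seen := firstCapture[id.Name]; !seen {
						firstCapture[id.Name] = node.Pos()
					}
				}
			}
		case *ast.AssignStmt:
			if node.Tok == token.ASSIGN { // skip ":=" declarations
				assigns = append(assigns, node)
			}
		}
		return true
	})

	modified := map[string]bool{}
	for _, as := range assigns {
		for _, lhs := range as.Lhs {
			id, ok := lhs.(*ast.Ident)
			if !ok {
				continue
			}
			if pos, captured := firstCapture[id.Name]; captured && as.Pos() > pos {
				modified[id.Name] = true
			}
		}
	}

	names := make([]string, 0, len(modified))
	for name := range modified {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	names, err := modifiedAfterGo(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints: [ch]
}
```

On the sieve example this flags only `ch`, because `ch = ch1` is the only plain assignment that follows a capture. A real implementation would likely need type information (for example from `go/types`) to resolve identifiers to objects and to tell reference types from basic types.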