
Explicit lambda capture clauses, stateless lambdas

Open jckarter opened this issue 13 years ago • 16 comments

It could be useful to allow environment captures to be made explicit in lambda syntax, if we can come up with a syntax that isn't too C++0x-ish. Having an arrow form to assert a stateless lambda would also be useful in contexts where a stateless function must be used.

jckarter avatar Jan 04 '12 03:01 jckarter

Possible extension to lambda syntax:

var y, z = 6, 7;
x => x + ref y + z  // y captured by ref, z is copied free variable
x -> x + ref y + z  // ref y is a no-op, all captured by ref

-> guarantees stateless. I think this would cover all cases. Any thoughts?

Note that I have assumed here that capturing by ref and by value in the same lambda is in some way useful. It occurs to me that this may not be the case and that the current lambda syntax is sufficient, in which case I guess this issue is no longer an issue.

ghost avatar Oct 11 '12 13:10 ghost
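For comparison, here is a minimal sketch of mixed capture in C++, the language the issue wants to avoid resembling. It is not Clay code; `mixed_capture_demo` is a hypothetical name used only for this illustration. It shows the difference the proposed `ref y` marker is meant to express: `y` is shared with the enclosing scope, while `z` is copied when the lambda is created.

```cpp
#include <cassert>

// Hypothetical helper, named only for this sketch.
int mixed_capture_demo() {
    int y = 6, z = 7;
    // y is captured by reference, z is copied at the point of creation.
    auto f = [&y, z](int x) { return x + y + z; };
    y = 10;   // visible to f, because y is captured by reference
    z = 100;  // not visible to f, which holds its own copy (7)
    return f(1);  // computes 1 + 10 + 7
}
```

Calling `mixed_capture_demo()` yields 18 rather than 14 or 111, which is the observable effect of mixing the two capture modes.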

I think that would clash with the return ref syntax. I don't see how -> guarantees stateless; your example still captures y and z. I'm not sure how generally useful mixed capture is, but stateless lambdas, unlike other lambdas, can be turned into CodePointers and CCodePointers, which is sometimes important when interfacing with C functions. Explicit capture is useful for verification or documentation purposes, for example if there's a local that really shouldn't be captured.

jckarter avatar Oct 11 '12 16:10 jckarter

I guess I was falsely under the impression that 'stateless' included capturing by reference, just no copies. My mistake.

ghost avatar Oct 11 '12 16:10 ghost

Technically you're right (the best kind of right), but by "stateless" I really meant "representable as a single function pointer".

jckarter avatar Oct 11 '12 16:10 jckarter
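As an illustration of "representable as a single function pointer", the C++ analogue is a captureless lambda, which converts implicitly to a plain function pointer and can therefore be handed to a C API. This is a sketch in C++, not Clay, and `sort_ints` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: sorts n ints in place through the C qsort API.
void sort_ints(int* data, std::size_t n) {
    // A lambda with an empty capture list converts to a function pointer.
    int (*cmp)(const void*, const void*) = [](const void* a, const void* b) {
        int lhs = *static_cast<const int*>(a);
        int rhs = *static_cast<const int*>(b);
        return (lhs > rhs) - (lhs < rhs);  // negative, zero, or positive
    };
    std::qsort(data, n, sizeof(int), cmp);
    // A capturing lambda, e.g. [n](...) { ... }, has no such conversion,
    // so assigning it to cmp would be rejected at compile time.
}
```

The compile-time rejection of capturing lambdas is roughly the guarantee a dedicated stateless arrow would give in Clay when a CodePointer or CCodePointer is required.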

I suppose the arrow syntax could be extended with ~> for stateless. Not a very clear distinction from the current syntax though.

ghost avatar Oct 11 '12 16:10 ghost

Yeah, that's maybe the least ugly choice. Unfortunately I think explicit capture clauses are just inherently ugly; if we free up the colon by removing the trailing block sugar, one possibility would be something like args ~>:(a, ref b) body.

Another interesting case to support would be capture-by-move, which is necessary to support capture of move-only objects like UniquePointer, and is also a good optimization for things like the monad pattern, where the lambda is the last consumer of the scope and so can move it into itself.

jckarter avatar Oct 11 '12 16:10 jckarter
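Capture-by-move can be sketched with a C++14 init-capture, assuming `std::unique_ptr` as a stand-in for Clay's UniquePointer. The function names below are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: returns a callable that owns the moved-in pointer.
auto make_reader(std::unique_ptr<int> p) {
    // Init-capture: the closure takes ownership of p by move.
    return [owned = std::move(p)]() { return *owned; };
}

// Shows that the enclosing scope gives up the object once it is moved.
bool source_is_empty_after_move() {
    auto p = std::make_unique<int>(1);
    auto f = [owned = std::move(p)]() { return *owned; };
    return p == nullptr && f() == 1;
}
```

A by-copy capture of `p` would not compile because `std::unique_ptr` is move-only, which is the case the comment above says a capture clause should be able to express.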

That's actually not too bad. If the frequency of use is quite low, then a little ugliness isn't a big deal as long as the syntax is reasonably consistent and concise. Whilst on the subject of arrow syntax, it may be better to restrict arrow usage to lambdas only (custom operators not included). For instance, the --> named-returns arrow seems superfluous:

foo(x) : Int32, Int64 {...}
foo(x) : Int32 = ... ;
foo(x) returned:Int32 {...}

it's all pretty obvious what's going on without the arrow.

Also, the corresponding initialization arrow could be dropped:

foo(x) ret:Int {

    var a = 3;
    a = 2; // assignment
    initialize(a, 6); // function call required in the rare case of re-initialization or if you want to be explicit
    ret = 7;   // initialize and assign or just assign if already initialized

    // or maybe use ':=' for explicit initialization operator
    ret := 7

}

ghost avatar Oct 11 '12 18:10 ghost

Well, initialization and named returns are unsafe and aren't meant to be used normally, which is part of the reason for the ugly three-character operators. They can't quite be replaced with a primitive, since multiple-value initialization (..a <-- ..b) needs to work. I agree though that the arrow notation is confusing. They could perhaps be replaced with an ugly-looking keyword like __init__.

jckarter avatar Oct 11 '12 18:10 jckarter

Interesting, I thought named returns were the way to go when possible to do so.

Maybe initialize named returns by default (go-esque) and use an ugly keyword to circumvent this behavior.

ghost avatar Oct 11 '12 18:10 ghost

There isn't much benefit to pre-initialized named returns. Uninitialized named returns are necessary to implement some low-level value semantics for builtin types, and are otherwise only there so you can manually perform NRVO if you don't trust the compiler. If the compiler performed automatic NRVO like C++ compilers do, and primitives were provided for low-level elementwise initialization of tuples and records, then they wouldn't be necessary at all.

jckarter avatar Oct 11 '12 18:10 jckarter

OK, I see. In that case the ugly keyword instead of arrows makes sense and the relevant compiler/primitive improvements should be raised as an issue (iirc NRVO already has a ticket) and named returns eventually removed.

ghost avatar Oct 11 '12 18:10 ghost

Yeah, eliminating named returns would be nice. A safer approach to guaranteeing NRVO would be to allow vars to be annotated saying "I really want this variable to be allocated into the return value", and having the compiler raise an error if it can't do that safely.

jckarter avatar Oct 11 '12 19:10 jckarter
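For readers unfamiliar with NRVO, here is a small C++ sketch of what the compiler may do automatically. `Tracked` and `make_tracked` are hypothetical names. The counters record how the named local reaches the caller: with NRVO applied it is constructed directly in the return slot, and otherwise it is moved, but in neither case is it copied.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    int value = 0;
    Tracked() = default;
    Tracked(const Tracked& o) : value(o.value) { ++copies; }
    Tracked(Tracked&& o) noexcept : value(o.value) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked make_tracked(int z) {
    Tracked result;        // named local, a candidate for NRVO
    result.value = z * 3;
    return result;         // elided under NRVO, otherwise moved
}
```

C++ does not require the elision here, so `Tracked::moves` may be 0 or more depending on the compiler and flags. The annotation idea above would turn that optimization into a checked guarantee.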

Do you mean explicitly annotated by the programmer?

Something like a return-binding:

foo(z) {
    return a = z * 3; 
    var b = bar(z);
    a += b; // a is returned automatically, since it was already declared as the return value
    return b; // this is a compile error
}

Looks kind of weird though . . .

ghost avatar Oct 15 '12 17:10 ghost

I'm not sure exactly what I mean. Something like that might work. Another possibility would be to allow named returns to be declared, but have the values be bound and initialized by vars within the return value instead of having them be bound implicitly and leaving initialization up to the user:

foo(z) a:Int {
    var a = z * 3;
    ...
}

jckarter avatar Oct 15 '12 17:10 jckarter

What about operator overloading? The compiler would pass the list of detected captures to an operator, and the lambda would store whatever that operator returns.

(->)(forward ..x) = ..x;
(#>)(..x) = move(..x);
(=>)(..x) = ..x;
(~>)() = return Tuple[];
(~>)(..x) = staticAssert(false);
(++>)(..x, RefNamesList: Vector, CopyNamesList: Vector) = // static for over ..x, C++-style explicit capture

Party hard:

(O_o>)(..x) = create_and_return_shared_ref //It's like shared_ptr, but ref
(T_T>)(..x) = gc_controlled_ref

galchinsky avatar Oct 22 '12 15:10 galchinsky

The lambda arrow isn't quite an operator because the left-hand side is parsed as an argument list rather than an expression. However, capture behavior could be handled by a hook function. In newclay, I had lambdas work by desugaring into an expression [L] captureLambda(#L, ..freeVars), where L was a symbol representing the capture type. By allowing L to be extensible, you could support custom capture implementations; as you noted in #430, having to manually cast a lambda to Function is awkward, and providing a compact syntax for Function literals would make higher-rank functional programming easier. I'm not sure how the custom symbol would look; maybe you could reserve operator symbols ending in > as lambda operators or something cheesy like that.

jckarter avatar Oct 22 '12 16:10 jckarter