rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Clone into closures

Open jonhoo opened this issue 6 years ago • 42 comments

When working with move closures, and especially when spawning new threads, I often find myself having to clone a bunch of state (usually Arcs, but not always). This is either because I want that state to be shared between many instances of a closure or with the closure and the code below the closure (e.g., Arc/Rc), or for mundane reasons like I need to have a String in each closure, and since thread::spawn requires 'static, I need to clone. In futures-based code, this happens a lot with Rc<Self> where I need to carry self along into futures chained with and_then. Usually, this results in code like this:

for _ in 0..nthreads {
    let foo = Arc::clone(&foo);
    let bar = Arc::clone(&bar);
    let baz = Arc::clone(&baz);
    thread::spawn(move || {

This appears a lot in the wild. Some very quick sampling yields: [1], [2], [3], [4], [5], [6], [7], [8].

Even if you're just spawning a single thread, you often need to use weirdly tweaked variable names. For example, this code:

let foo = Arc::new(...);
thread::spawn(move || {
    foo.baz();
});
foo.baz();

must be written as

let foo = Arc::new(...);
let foo_clone = foo.clone();
thread::spawn(move || {
    foo_clone.baz();
});
foo.baz();

or if you want to retain the foo name (which you often do), you need to introduce an extra artificial scope:

let foo = Arc::new(...);
{
    let foo = foo.clone();
    thread::spawn(move || {
        foo.baz();
    });
}
foo.baz();

I don't have a concrete proposal for how this would be fixed. It'd be really neat to have something like clone ||, but that's probably overly simplistic as often you want to move some variables and clone others. Some better ergonomics for this would be great!

jonhoo avatar Apr 19 '18 02:04 jonhoo

A c++-lambda-ish macro could do all that boilerplate for us.

oli-obk avatar Apr 20 '18 12:04 oli-obk

Lambda captures are one of the rare points where I think C++ did it better than Rust: they are explicit and you can express how to capture on a per-capture basis (in Rust that would be by reference, by Clone, by move, by some arbitrary expression (C++ allows [x=f(y)])).

mcarton avatar Apr 20 '18 18:04 mcarton

I like the idea of clone ||, I run into this quite a bit.

For reference, C++-style explicit lambda captures were considered pre-1.0. IIRC the reason for what we have now was that it requires fewer language features- you can get the same effects via the combination of a) default capture-by-reference and b) move closures that can capture temporary bindings.

In the limit it's not even very different syntactically:

[x = abc, y = def, &z]() { ... x ... y ... z }
{ let x = abc; let y = def; let z = &z; move || { ... x ... y ... z } }

I think what tips the balance is that C++'s equivalent of .clone() is the default there, so when dealing with reference counting Rust has quite a bit more noise. So in the interest of keeping clone explicit (doing away with it is a question for another thread, I think), clone || is certainly a nice way to do things.

The hard part is that move || has a natural opt-out- you selectively capture-by-move &Ts. That still works for clone, but I'm not sure if there's a great way to opt-out "half way" and selectively move Ts.

rpjohnst avatar Apr 20 '18 22:04 rpjohnst

macro could do all that boilerplate

I can't find the one I'm thinking about, but I know that macros exist for this. A quick search found:

Although one of those looks real old.

shepmaster avatar Apr 20 '18 23:04 shepmaster

Scala has a proposal something similar to C++'s explicit capturing annotations in the form of spores. I think the corresponding Rust syntax would look a lot like what @rpjohnst suggested above

use std::sync::Arc;
use std::sync::Mutex;
use std::thread;

fn main() {
    let foo = Arc::new(Mutex::new(0));
    thread::spawn({ let foo = foo.clone(); move || {
        let _x = foo.lock();
    }});
    let _y = foo.lock();
}

RalfJung avatar Apr 21 '18 13:04 RalfJung

While that syntax seems fine if you only need to manipulate a few variables, it gets quite hairy in situations like this. That's why I proposed copy || so that you could change the default behavior to clone. I think it's quite common to either copy most things or move most things into a closure (especially in the context of thread::spawn or making futures).

jonhoo avatar Apr 21 '18 16:04 jonhoo

What about a clone_all macro?

Hygiene stops at passed-in identifiers. A macro can take that identifier and create a let binding with it.

You can have let $name = $name.clone().

Maybe this macro should be built-in: https://play.rust-lang.org/?gist=f3e8daccce06320e831aeb29dc26ea5c&version=stable&mode=debug

SoniEx2 avatar Apr 28 '18 18:04 SoniEx2

So this:

        let args = Arc::clone(args);
        let quiet_matched = quiet_matched.clone();
        let paths_searched = paths_searched.clone();
        let match_line_count = match_line_count.clone();
        let paths_matched = paths_matched.clone();
        let bufwtr = Arc::clone(&bufwtr);
        let mut buf = bufwtr.buffer();
        let mut worker = args.worker();

Becomes:

        let args = Arc::clone(args);
        clone_all!(quiet_matched, paths_searched, match_line_count, paths_matched);
        let bufwtr = Arc::clone(&bufwtr);
        let mut buf = bufwtr.buffer();
        let mut worker = args.worker();

It should also be possible to modify the macro to take optional mut before $i:ident, so you can do this:

clone_all!(a, mut b, c);

SoniEx2 avatar Apr 28 '18 19:04 SoniEx2

Aaand I just make a crate for it. because y'know, I just had to. https://crates.io/crates/clone_all

SoniEx2 avatar Apr 28 '18 19:04 SoniEx2

There are many scenarios where you should use either Cow::from(..)/Cow::Borrowed(..) or else do let foo = Arc/Rc::new(foo) followed by .clone(), instead of .clone() itself.

In any case, you could address the syntax directly without special casing clone with capture expressions that evaluate just outside the closure being created, so

foo(|..| { ... capture { Cow::from(bar) } ... })

becomes

{  let capture0 = Cow::from(bar);  foo(|..| { ... capture0 ... })  }

We avoid noise around the |..| too, while supporting more scenarios ala Cow.

Ideally, you could do this via a capture! proc macro crate, rather extending rust itself, which avoids any debate about "complexity budget". I've never looked into proc macros that access the AST outside their call site, but if that works then that's clearly the way to go.

burdges avatar Apr 29 '18 16:04 burdges

There's this enclose macro that's used in a load of places including stdweb:

macro_rules! enclose {
    ( ($( $x:ident ),*) $y:expr ) => {
        {
            $(let $x = $x.clone();)*
            $y
        }
    };
}

Example use: https://github.com/rust-webplatform/rust-todomvc/blob/master/src/main.rs#L142

Diggsey avatar Apr 29 '18 23:04 Diggsey

@SoniEx2 would you consider adding this to your crate? https://gist.github.com/rust-play/1cc71ed137157bb8115a9f063aabbea3

It allows using clone_all! on self.attribute:

struct A {
    x: String,
    y: String,
    z: String,
}
impl A {
    fn foo(&self) {
        clone_all!(self.x, self.y, self.z);
        (move || {
            println!("{} {} {}", x, y, z);
        })();
    }
}

lukaspiatkowski avatar Jul 10 '18 01:07 lukaspiatkowski

@lukaspiatkowski I encourage ppl to fork the things I've made. The whole point of free software is the ability to fork things, so please fork my things! (I'm not sure why ppl don't fork things...)

SoniEx2 avatar Jul 10 '18 01:07 SoniEx2

@SoniEx2 Sure, I never got the hang of it, every time I try to contribute to open source libraries the authors are not very helpful, so its hard to start doing so. I guess it comes with practice.

I have send a pull request: https://bitbucket.org/SoniEx2/clone_all/pull-requests/1

lukaspiatkowski avatar Jul 10 '18 12:07 lukaspiatkowski

This is a bit out of this issue's scope, but related: what if I want to selectively move variables into the closure? One example in which this came up today:

let v = vec!["a", "b", "c"];

rayon::scope(|s| {
    for i in 0..9 { // `i` does not live long enough
        s.spawn(|_| {
            println!("{}", v[i % 3])
        });
    } // borrowed value only lives until here
}); // borrowed value needs to live until here

println!("{:?}", v);

playground link

I really don't want to move v, nor do I want to clone it. But I have to move i. However, today I can only either move everything or capture everything by reference.

This is a very simple example, but as shown in some comments above, it's not unusual to come across situations in which the workarounds are quite verbose. There's a clear need (in my view) of more granularity.

I don't have any good suggestions at the moment, but something analogous to C++'s capture list would be handy. This macro (linked above as well) is one way to handle this, but I really think there should be a solution built into the language.

cauebs avatar Aug 16 '18 04:08 cauebs

@cauebs move is strictly more general than not moving. To capture by reference, do something like

{ let v = &v;
  move |...| {
  /* use v, it will be a reference */
  }
}

This is how it looks in your example. It costs two extra lines, which is not great but also not horrible.

RalfJung avatar Aug 16 '18 06:08 RalfJung

Or nikomatsakis' suggested form of the same, which places the borrow in the closest possible location, right when providing the closure:

rayon::scope(|s| {
    for i in 0..10 {
        s.spawn({
            let v = &v;
            move |_| println!("{}", v[i % 3])
        });
    }
});

This may be what @RalfJung meant, but the borrow can occur later than shown so I was a bit confused. Normally I'd have expected to see their code as:

rayon::scope(|s| {
    let v = &v;
    for i in 0..10 {
        s.spawn(move |_| println!("{}", v[i % 3]));
    }
});

shepmaster avatar Aug 16 '18 12:08 shepmaster

@RalfJung As I said, it was a simple example. But imagine there were dozens of variables that I wanted to be captured by reference, and only a single one that I wanted to be moved. That one is the exception, and that should be expressable in the code. I shouldn't have to change the default (moving everything) and treat all the others as exceptions.

I come from Python and have, with time, learned that if something is simple to explain (and reasonably common), it should be just as simple to encode. Otherwise, the language might be lacking some expressiveness.

cauebs avatar Aug 16 '18 15:08 cauebs

@cauebs AFAIK you can also do let v = *v; or something similar above the closure to express "capture v by move", just as let v = &v; expresses "capture v by ref".

The idea of capture lists has been raised many times, but I've never seen a concrete syntax proposal that's significantly more concise than adding a few let statements before the closure without becoming cryptic. I don't think any language besides C++ has ever needed capture lists, and I believe C++ only needs them because it would be wildly unsafe to try and elide them in that language.

Ixrec avatar Aug 16 '18 16:08 Ixrec

I think let v = *v; would do something different (e.g. dereference a String to str), but let v = {v}; (or just {v} if there’s only one use site) might be a way to force a move: https://bluss.github.io/rust/fun/2015/10/11/stuff-the-identity-function-does/

SimonSapin avatar Aug 17 '18 12:08 SimonSapin

I think @Diggsey enclose! is a very neat solution to this, would it make sense to have an RFC specifically to have this in stdlib?

phhusson avatar Jan 22 '20 10:01 phhusson

I independently ran into this myself. Is there a reason we wouldn't want to add a bit of extra syntax specifically for cloning bindings at the point of closure? The suggestion I thought up was a clone keyword, like the move keyword.

let clonable = ...;
// compiler assumes _.clone() for all bindings used within the closure
let c = clone || { clonable.do_something(); };
clonable.do_something_else();

Also, I considered you might want to mix move and clone:

// compiler clones only the specified binding and moves the rest
let c = move + clone clonable || {clonable.do_something;};

or possibly even cloning multiples:

// compiler clones the specified bindings, and moves the rest
let c = move + clone{clonable1, clonable2} || {...etc...};

The macro solution is interesting, but it feels to me more like a workaround to something that should be in the language.

AberrantWolf avatar May 02 '20 05:05 AberrantWolf

@AberrantWolf Using macros isn't considered a workaround, it's considered a good idiom. The smaller that the Rust language can be, the better. So if something can be implemented as a library, it's considered better to implement it as a library.

Rust doesn't even have (non-const) global variables, but you can use the lazy_static macro to create global variables. It also doesn't have an easy way to construct dictionaries, but you can use the maplit macros. Similarly, vec!, println!,assert!, panic!, matches!, and thread_local! are macros.

Macros are considered an integral part of Rust, and they are widely used for all sorts of things. They aren't a hack.

I personally would be in favor of built-in support for this, but the bar for new syntax and features is very high. So rather than new syntax, it's more likely that a clone! macro could be added to stdlib.

Pauan avatar May 02 '20 06:05 Pauan

I suspect our existing syntax provides the only intuitive choice:

let f = {
    let x = x.clone();
    let y = &y;
    move |w| ..x..y..z...;
};

There are no intuitive capture list proposals because they all increase language strangeness with new grouping operations.

We discussed clone || ... which clones any Clone captures, from which one recaptures normal behavior like:

let g = {
    let y = &y;  // Copy &T with &T: Clone  
    let z = NoClone(z);  // force move
    move |w| ..x..y..z...;
};

I believe clone || ... only makes learning the language harder because new users learn far more from reading f than reading g.

We also danced around some run_outside_this_closure! builtin that avoids capture lists, but the same critique applies. Instead I'd want NLL improvements so that if x: 'x then f: 'x + FnOnce() -> impl FnOnce() meaning the second closure relinquishes the lifetime 'x in

let f = || { let x = x.clone(); move || ..x.. }

There are good reasons for HashMap: Clone and ChaChaRng: Clone but also security implications because they involve secret randomness. If accidental clones become more endemic, then crate developers will respond by rejecting Clone for more types, which breaks #[derive(Clone)] and even demands crate forks. Ugh!

We already need an anti-clone lint for types like HashMap and ChaChaRng so that invoking HashMap::clone triggers a warning.

burdges avatar May 02 '20 08:05 burdges

There's no way we would use the syntax clone |...| ..., as it is ambiguous with the | and || operators.

let clone = false;
let bool_var = true;
let res = clone || bool_var;

and we won't make clone a keyword either, as this either introduces some very specialized contextual rules (clone is a keyword before | and || but an identifier otherwise), or you need to use foo.r#clone() everywhere.

Forms including a keyword like move + clone |...| ... does work (though ugly).

kennytm avatar May 02 '20 09:05 kennytm

Maybe it is enough to use proc-macro with explicitly listed arguments instead of some syntax extensions? That approach already works pretty fine I think. Though, it is still requires stmt_expr_attributes and proc_macro_hygiene nightly features.

l4l avatar May 03 '20 16:05 l4l

Btw, in the meantime, there is a macro for this: https://crates.io/crates/closure

It supports clone (e.g. closure!(clone first, clone mut second, || { .. }), move (e.g. closure!(move first, move mut second, || { .. }), ref (e.g. closure!(ref mut a, ref b, || { .. })) etc. as well as capturing by any other x.$ident() (clone is not a special case, e.g. to_string also works). Also members (any depth), e.g. closure!(ref self.foo.bar, || println!("{}", bar))();

The only disadvantage is that if you use this macro, rustfmt won't format the body of the closure (which really sucks)! (And the fact that if you already wrote the closure like move |..| you have to remove the move keyword, as well as navigating to the end of the closure body and inserting ) after it.)


Ideally a way that doesn't require wrapping the closure body in parens would be preferrable. Such as:

move[clone foo, clone mut bar]|arg| { .. }

(This just requires a single cursor placement (between move and |arg|) and you can specify everything there.) How about this? :)

So then, this

    let args = Arc::clone(args);
    let quiet_matched = quiet_matched.clone();
    let paths_searched = paths_searched.clone();
    let match_line_count = match_line_count.clone();
    let paths_matched = paths_matched.clone();
    let bufwtr = Arc::clone(&bufwtr);
    let mut buf = bufwtr.buffer();
    let mut worker = args.worker();
    Box::new(move |result| {

would look like this:

    let args = Arc::clone(args);
    let bufwtr = Arc::clone(&bufwtr);
    Box::new(
        move[
            clone quiet_matched,
            clone paths_searched,
            clone match_line_count,
            clone paths_matched,
            buffer mut bufwtr,
            worker mut args,
        ]|result| {
            ..
        }
    )

Boscop avatar May 05 '20 20:05 Boscop

Btw, in the meantime, there is a macro for this:

There are many, as discussed in

  • https://github.com/rust-lang/rfcs/issues/2407#issuecomment-383244191
  • https://github.com/rust-lang/rfcs/issues/2407#issuecomment-385291238
  • https://github.com/rust-lang/rfcs/issues/2407#issuecomment-385198310

shepmaster avatar May 06 '20 01:05 shepmaster

I think move[..] |..| .. seems too niche, meaning it complicates reading and learning the language.

Instead, I'd suggest some external crate provide a "rebind" macro that avoids the repetition under more "self-like" situations, ala let foo = &foo.bar();, let foo = &mut foo.bar(); etc.

rb! {
    quiet_matched.clone();
    my_paths.iter().map(Path::to_path_buf).cloned().collect::<Vec<_>>();
    // Avoidt HashMap: Clone because it enables DoS attacks.
    my_hashmap.iter().map(|k,v| (k.clone(),v.clone())).collect::<HashMap<K,V>>()
    my_cow.into_owned();
    my_refcell.into_inner();
    my_mutex.into_inner();
    // I think Arc/Rc::clone/try_unwrap cannot work so easily.
};
Box::new(move |result| { ... })

We tie rb! closely to let foo = foo... which makes it less strange. It'd work for field puns in structs and enums too, which makes rb! more useful and slightly less strange.

burdges avatar May 06 '20 08:05 burdges

I think let v = *v; would do something different (e.g. dereference a String to str), but let v = {v}; (or just {v} if there’s only one use site) might be a way to force a move: https://bluss.github.io/rust/fun/2015/10/11/stuff-the-identity-function-does/

Bafflingly, only when v's type isn't Copy! This runs fine, but uncomment the #[derive(Copy)] and it all falls apart like a house of cards.

nightkr avatar Feb 17 '21 01:02 nightkr