examples-futures icon indicating copy to clipboard operation
examples-futures copied to clipboard

having a potential deadlock in code

Open zylthinking opened this issue 2 years ago • 3 comments

That can be produced when main drop reactor early than in threads. e.g

reactor.lock().map(|mut r: MutexGuard<Box<Reactor>>| r.wake(id)).unwrap(); 
thread::sleep(Duration::from_secs(6)); 
drop(reactor);

and

block_on(mainfut);
drop(reactor);
thread::sleep(Duration::from_secs(12));

zylthinking avatar Aug 25 '21 04:08 zylthinking

As long as there are tasks queued, dopping the reactor should only decrease the refcount since it's wrapped in an Arc and each task needs a copy of the reactor to work. I'm not sure, I understand the deadlock scenario fully here. Could you provide an example I can test using the final example, for example using the playground (although I guess it will time out with the sleep intervals in your code above I can copy it and run in locally): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=91a5cdf856530eecfb46345a36fb5bbb

cfsamson avatar Aug 25 '21 22:08 cfsamson

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cf33e19fda62de240ff9cf24fe50c6be @.***/0?redirect=https%3A%2F%2Fplay.rust-lang.org%2F%3Fversion%3Dstable%26mode%3Ddebug%26edition%3D2018%26gist%3Dcf33e19fda62de240ff9cf24fe50c6be&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) I add some logs to show what happened

On Aug 26 2021, at 6:19 am, Carl Fredrik Samson @.***> wrote:

As long as there are tasks queued, dopping the reactor should only decrease the refcount since it's wrapped in an Arc and each task needs a copy of the reactor to work. I'm not sure, I understand the deadlock scenario fully here. Could you provide an example I can test using the final example, for example using the playground (although I guess it will time out with the sleep intervals in your code above I can copy it and run in locally): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=91a5cdf856530eecfb46345a36fb5bbb @./1?redirect=https%3A%2F%2Fplay.rust-lang.org%2F%3Fversion%3Dstable%26mode%3Ddebug%26edition%3D2018%26gist%3D91a5cdf856530eecfb46345a36fb5bbb&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub @./2?redirect=https%3A%2F%2Fgithub.com%2Fcfsamson%2Fexamples-futures%2Fissues%2F1%23issuecomment-905914655&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D), or unsubscribe @./3?redirect=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAIEPMZ5TBITCTARDT52QWLT6VT67ANCNFSM5CYGKCXA&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D). Triage notifications on the go with GitHub Mobile for iOS @./4?redirect=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) or Android @.***/5?redirect=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26utm_campaign%3Dnotification-email&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D).

zylthinking avatar Aug 27 '21 15:08 zylthinking

Thanks. Good catch!

I think the major issue is that I use drop to shut down the reactor, but can't guarantee that the shutdown signal isn't sent from one of the child threads (as you showed when the refcount reaches 1 in one of the child threads instead of the main thread as intended).

The issue is probably a bit contrived since I fire off a seperate thread to handle each event for this example as we don't have any real I/O and use sleeping threads to simulate the waiting. In a real Reactor you wouldn't fire off threads that way and rely on sleep for wakeup. One solution is to treat the timer threads as fire-and-forget since they're only created to be able to simulate waiting for an external event. We don't have to join them.

If we want to keep with best practices and join all spawned threads (even the ones that we create only for simulation). I think the cleanest solution is to not rely on Drop for cleanly shutting down the Reactor but instead provide an instance method for it so if we remove the Drop impl and instead add an instance method called shutdown on Reactor:

    fn shutdown(&mut self) {
        println!("really drop, sending quit");
        self.dispatcher.send(Event::Close).unwrap();
        println!("waiting for executor thread done");
        self.handle.take().map(|h| h.join().unwrap()).unwrap();
        println!("executor thread ended");
    }

And in "main" we need to call this instead of relying on drop to shutdown the reactor:

    println!("drop 1");
    reactor.lock().map(|mut r| r.shutdown()).unwrap();
    println!("just wait long to observe 2 threads will blocked forever");
    thread::sleep(Duration::from_secs(1000));

Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=67550c19e85246827cdb8422b63ceb44

What do you think? Do you think it's better to rewrite Drop or just treat the timer threads as fire-and-forget by not joining them?

cfsamson avatar Aug 29 '21 13:08 cfsamson