libraft

KayVee can hang on shutdown

Open allengeorge opened this issue 11 years ago • 3 comments

Once in a blue moon it appears that KayVee can hang on shutdown. The problem has been traced to a failure of the underlying Netty NioWorkerPool to shut down cleanly (it appears to be waiting for a CountDownLatch to reach 0, a condition that, for some reason, never happens).

The full stack is at: KayVee 0.1.1 Shutdown Hang Stack

allengeorge avatar Feb 14 '14 16:02 allengeorge

Another source of the hang: deadlock on KayVee 0.1.1 shutdown

allengeorge avatar Feb 18 '14 05:02 allengeorge

This is caused by a System.exit() call within RaftAlgorithm, which can trigger a deadlock. See: http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#exit(int). The fix is to either:

  1. Detect that I'm in the middle of a shutdown and not run System.exit()
  2. Avoid System.exit() entirely
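A rough sketch of option 1, assuming a small helper that the shutdown path flips before it starts stopping things. `ExitGuard`, `markShuttingDown`, and `exitUnlessShuttingDown` are hypothetical names and not part of libraft; the exit action is injected so the guard can be exercised without killing the JVM.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

// Hypothetical helper (not libraft API): skips the exit call once shutdown has begun.
final class ExitGuard {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final IntConsumer exitAction;

    // In production the action would be System::exit; tests can pass a stub.
    ExitGuard(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    // Called by the shutdown path (for example at the top of a stop() method).
    void markShuttingDown() {
        shuttingDown.set(true);
    }

    // Runs the exit action unless shutdown is already in progress.
    // Returns true if the exit action was invoked.
    boolean exitUnlessShuttingDown(int status) {
        if (shuttingDown.get()) {
            return false;
        }
        exitAction.accept(status);
        return true;
    }
}
```

Usage would look like `new ExitGuard(System::exit)`. The check and the exit are not atomic, so this only narrows the window; it covers the case where shutdown has already started, not the case where the exit call itself starts it.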

allengeorge avatar Feb 21 '14 05:02 allengeorge

OK. I think I have the exact cause here.

If I call System.exit in a Netty I/O thread the following happens:

  1. The thread locks the Shutdown.class object
  2. The Jetty ShutdownThread is invoked, which starts running the shutdown tasks we've registered
  3. One of the shutdown tasks is RaftAgent.stop(), which waits for all I/O threads to complete

And...deadlock. The Netty I/O thread is blocked inside System.exit waiting for all the shutdown tasks to finish, but they can't finish because one of them is waiting for that same I/O thread to complete.
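One way to break this cycle, sketched under the assumption that an exit is still wanted: hand the exit call to a fresh thread so the I/O thread can return and terminate while the shutdown tasks run. `ExitHandoff` and `exitAsync` are hypothetical names, not something in libraft or Netty, and the exit action is injected so the sketch can run without terminating the JVM.

```java
import java.util.function.IntConsumer;

// Hypothetical helper (not libraft API): runs the exit action on a separate thread.
final class ExitHandoff {
    private ExitHandoff() {}

    // Starts a new thread that invokes exitAction with the given status and
    // returns immediately, so the calling (I/O) thread is never the one
    // blocked waiting for shutdown tasks to finish.
    static Thread exitAsync(int status, IntConsumer exitAction) {
        Thread t = new Thread(() -> exitAction.accept(status), "exit-handoff");
        t.start();
        return t;
    }
}
```

From an I/O thread this would be called as `ExitHandoff.exitAsync(1, System::exit)`. The new thread is then the one that blocks in the exit call, and a task that waits for the I/O threads is free to see them finish.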

allengeorge avatar Mar 30 '14 19:03 allengeorge