
'Hazelcast instance is not active' during CamelContext shutdown

Open nhoughto opened this issue 11 years ago • 8 comments

I've got my hazelcastmq + camel integration working quite well. However, when I try a graceful shutdown (sending SIGINT to the running Java process and calling .stop() on the Camel context from a hook registered via Java's addShutdownHook()), hazelcastmq has trouble stopping its listeners. Here is the stack trace:

2014-08-15 02:28:54:545 WARN  org.apache.camel.impl.DefaultShutdownStrategy  - Error occurred while shutting down route: Consumer[hazelcastmq://queue:blah?concurrentConsumers=25]. This exception will be ignored.
com.hazelcast.core.HazelcastInstanceNotActiveException: Hazelcast instance is not active!
    at com.hazelcast.spi.AbstractDistributedObject.getService(AbstractDistributedObject.java:78)
    at com.hazelcast.queue.proxy.QueueProxySupport.removeItemListener(QueueProxySupport.java:191)
    at com.hazelcast.queue.proxy.QueueProxyImpl.removeItemListener(QueueProxyImpl.java:34)
    at org.mpilone.hazelcastmq.core.DefaultHazelcastMQConsumer$HzQueueListener.shutdown(DefaultHazelcastMQConsumer.java:400)
    at org.mpilone.hazelcastmq.core.DefaultHazelcastMQConsumer.stop(DefaultHazelcastMQConsumer.java:230)
    at org.mpilone.hazelcastmq.core.DefaultHazelcastMQContext.stop(DefaultHazelcastMQContext.java:244)
    at org.mpilone.hazelcastmq.camel.HazelcastMQCamelConsumer$SingleThreadedConsumer.stop(HazelcastMQCamelConsumer.java:118)
    at org.mpilone.hazelcastmq.camel.HazelcastMQCamelConsumer.doStop(HazelcastMQCamelConsumer.java:69)
    at org.apache.camel.support.ServiceSupport.stop(ServiceSupport.java:102)
    at org.apache.camel.util.ServiceHelper.stopService(ServiceHelper.java:141)
    at org.apache.camel.impl.DefaultShutdownStrategy.shutdownNow(DefaultShutdownStrategy.java:336)
    at org.apache.camel.impl.DefaultShutdownStrategy$ShutdownTask.run(DefaultShutdownStrategy.java:609)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Looks like it's trying to stop listening to stuff, but Hazelcast has already been stopped?

nhoughto avatar Aug 15 '14 02:08 nhoughto

I was able to reproduce the problem. I'll look into a fix. Thanks for the report.

mpilone avatar Aug 18 '14 00:08 mpilone

It looks like Hazelcast registers its own shutdown hook that can (and normally does) get called before your hook, so by the time your hook runs the Hazelcast instance is already shut down. I found that you can work around the issue by disabling Hazelcast's hook with config.setProperty(GroupProperties.PROP_SHUTDOWNHOOK_ENABLED, "false"); in your configuration code. Just make sure you shut down Hazelcast in your own hook.
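For anyone landing here, a minimal sketch of that workaround might look like the following. It assumes Hazelcast 3.x and Camel 2.x on the classpath; the class name ShutdownOrderExample is made up for illustration, and the HazelcastMQ component and route setup are omitted because they depend on your application.

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.instance.GroupProperties;
import org.apache.camel.CamelContext;
import org.apache.camel.impl.DefaultCamelContext;

// Hypothetical example class; not part of hazelcastmq.
public class ShutdownOrderExample {

  public static void main(String[] args) throws Exception {
    // Stop Hazelcast from registering its own JVM shutdown hook.
    Config config = new Config();
    config.setProperty(GroupProperties.PROP_SHUTDOWNHOOK_ENABLED, "false");
    final HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

    // HazelcastMQ component registration and routes would be added here.
    final CamelContext camel = new DefaultCamelContext();
    camel.start();

    // One application-owned hook, so the order is explicit:
    // stop Camel (and its consumers) first, then Hazelcast.
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      try {
        camel.stop();
      } catch (Exception e) {
        e.printStackTrace();
      } finally {
        // Hazelcast is shut down even if stopping Camel failed.
        hz.getLifecycleService().shutdown();
      }
    }));
  }
}
```

The point of the single hook is that the listeners are removed while the Hazelcast instance is still active, which is the step that fails in the stack trace above.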

I'm going to do some more research on this, but I might raise it as a Hazelcast defect. It seems like any application depending on shutdown hooks could have problems if the Hazelcast node is shut down before the application gets a chance to shut down cleanly.

mpilone avatar Aug 18 '14 02:08 mpilone

@mpilone, it does not work even with the property set to false. Please take a look at this issue (raised just a month ago): https://github.com/hazelcast/hazelcast/issues/3695

The cause is a shutdown triggered from the Hazelcast nodeEngine itself.

bwzhang2011 avatar Oct 25 '14 04:10 bwzhang2011

I don't see the problem once I set the configuration property shown in my previous comment. Can you provide a test case that shows the problem even with that property set to "false"?

This defect only occurs during JVM shutdown when you are using a shutdown hook: the order in which the hooks execute is not guaranteed, so Hz may get shut down before the rest of your application (i.e. Spring or Camel). With the configuration flag set, Hz shouldn't register a shutdown hook, and therefore you can control the shutdown order in your own hook (or via Spring destroy methods).
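To illustrate the "control the order in your own hook" idea without any Hazelcast or Camel dependency, here is a rough stdlib-only sketch. OrderedShutdown and its method names are hypothetical, not part of any library mentioned in this thread. Steps run in the order they were registered, and a failing step does not prevent the later ones from running.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: runs shutdown steps in registration order.
class OrderedShutdown {

  private final Map<String, Runnable> steps = new LinkedHashMap<>();

  // Register a named step; earlier registrations run first.
  void register(String name, Runnable step) {
    steps.put(name, step);
  }

  // Run every step in order and return the names of the steps that threw.
  List<String> runAll() {
    List<String> failed = new ArrayList<>();
    for (Map.Entry<String, Runnable> entry : steps.entrySet()) {
      try {
        entry.getValue().run();
      } catch (RuntimeException e) {
        // Record the failure and keep going so later steps still run.
        failed.add(entry.getKey());
      }
    }
    return failed;
  }

  // Install a single JVM shutdown hook that runs all steps.
  void installAsHook() {
    Runtime.getRuntime().addShutdownHook(new Thread(this::runAll));
  }
}
```

In an application like the one in this issue, you would register the step that stops the Camel context first and the step that shuts down Hazelcast last, then call installAsHook() once at startup.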

mpilone avatar Oct 30 '14 13:10 mpilone

@mpilone, thanks for the feedback. In hz3.3.1 it occurred occasionally, but in hz3.3.2 it improved a lot. My configuration is the usual one, without any special use case. I will continue to monitor the cluster environment. By the way, the problem I described was actually triggered by a client aborting its connection, which caused the node to be destroyed (I did set the configuration parameter to false).

bwzhang2011 avatar Nov 10 '14 01:11 bwzhang2011

@mpilone, it's very strange that hz3.3.3 still has many problems with the "Hazelcast instance is not active" exception. If you have time, please give it a try and run a long-polling test for several days, especially against IQueue.

bwzhang2011 avatar Dec 03 '14 23:12 bwzhang2011

It seems that the hz team will make some improvements to the Hazelcast client in 3.3.4 (due soon), so that the client no longer invokes the shutdown operation that makes the instance not active.

bwzhang2011 avatar Dec 03 '14 23:12 bwzhang2011

@mpilone, how is this going? In my production environment, with hz3.3.5 (the last version in hz3.3.x), HazelcastInstanceNotActiveException still occurs, though only when the network has been switching for a long time, and everything is running fine currently. Next month we will upgrade to hz3.4.1 or an hz3.5 snapshot for testing.

bwzhang2011 avatar Feb 15 '15 01:02 bwzhang2011