AMQ-9448 Fix persistent scheduler deadlock
Do not fire or schedule jobs while holding the read lock on the store.
Please use the existing org.apache.activemq.broker.scheduler.JmsSchedulerTest#testCron test to confirm that the persistent scheduler currently deadlocks on CRON jobs. This is not the same deadlock as the one reported in AMQ-9448, but it has the same root cause.
I have now added a new unit test. Does the existing JmsSchedulerTest#testCron unit test pass for you without this change?
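The idea can be sketched as follows: take a snapshot of the due jobs under the read lock, release it, and only then fire or reschedule. This is a minimal illustration, not the actual patch; `SchedulerSketch`, `storeLock`, `jobs`, `snapshotJobs`, and `runOnce` are hypothetical names, and the real JobSchedulerImpl works against the KahaDB index rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a scheduler whose store is guarded by a read/write lock.
public class SchedulerSketch {
    private final ReentrantReadWriteLock storeLock = new ReentrantReadWriteLock();
    private final List<String> jobs = new ArrayList<>();

    // Writing a job requires the write lock.
    void schedule(String job) {
        storeLock.writeLock().lock();
        try {
            jobs.add(job);
        } finally {
            storeLock.writeLock().unlock();
        }
    }

    // Copy the jobs under the read lock; the lock is released before returning.
    List<String> snapshotJobs() {
        storeLock.readLock().lock();
        try {
            return new ArrayList<>(jobs);
        } finally {
            storeLock.readLock().unlock();
        }
    }

    // One pass of the loop: no lock is held while rescheduling,
    // so schedule() can take the write lock without blocking on this thread.
    int runOnce() {
        List<String> due = snapshotJobs();
        for (String job : due) {
            schedule(job + "-next");
        }
        return due.size();
    }

    int size() {
        return snapshotJobs().size();
    }

    public static void main(String[] args) {
        SchedulerSketch scheduler = new SchedulerSketch();
        scheduler.schedule("cron");
        System.out.println("rescheduled: " + scheduler.runOnce());
        System.out.println("jobs in store: " + scheduler.size());
    }
}
```

The trade-off is that the snapshot may be slightly stale by the time a job is fired, so a real implementation would need to tolerate jobs that were removed in the meantime.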
Do you have a thread dump of the deadlock occurring?
Yes. This is the stack trace of the "JobScheduler:JMS" thread, which blocks itself: it acquires the read lock on the store while iterating over scheduled jobs in the mainLoop() method, and then attempts to acquire the write lock to persist a newly scheduled job:
"JobScheduler:JMS" #28 daemon prio=5 os_prio=0 cpu=11.62ms elapsed=117.73s tid=0x00007f43d1344b50 nid=0x185b waiting on condition [0x00007f4356cfe000] java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park([email protected]/Native Method) - parking to wait for <0x000000008bd25150> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock([email protected]/ReentrantReadWriteLock.java:959) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl$8.visit(JobSchedulerStoreImpl.java:684) at org.apache.activemq.store.kahadb.data.KahaAddScheduledJobCommand.visit(KahaAddScheduledJobCommand.java:283) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl.process(JobSchedulerStoreImpl.java:679) at org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:495) at org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:403) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.doSchedule(JobSchedulerImpl.java:252) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.schedule(JobSchedulerImpl.java:100) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.mainLoop(JobSchedulerImpl.java:782) at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.run(JobSchedulerImpl.java:699) at java.lang.Thread.run([email protected]/Thread.java:833)
Full dump: dump.txt
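For readers unfamiliar with why a single thread can block itself here: ReentrantReadWriteLock does not support upgrading from a read lock to a write lock, so a thread that holds the read lock and then requests the write lock waits indefinitely. The standalone sketch below reproduces that behaviour outside ActiveMQ. The class and method names are made up for illustration, and it uses a timed tryLock instead of lock() so that it terminates rather than hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockUpgradeDemo {

    // Timed attempt at the write lock; returns whether it was acquired.
    private static boolean tryWrite(ReentrantReadWriteLock lock) {
        try {
            boolean acquired = lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Same shape as the stack trace: request the write lock while still holding the read lock.
    // With writeLock().lock() instead of a timed tryLock, this thread would park forever.
    static boolean upgradeWhileReading() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            return tryWrite(lock);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Release the read lock first, then request the write lock.
    static boolean writeAfterReleasingRead() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        lock.readLock().unlock();
        return tryWrite(lock);
    }

    public static void main(String[] args) {
        System.out.println("upgrade while reading: " + upgradeWhileReading());            // false
        System.out.println("write after releasing read: " + writeAfterReleasingRead());  // true
    }
}
```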