
AMQ-9448 Fix persistent scheduler deadlock

Open thezbyg opened this issue 1 year ago • 4 comments

Do not fire or schedule jobs while holding read lock on store.
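A minimal sketch of the idea, not the actual patch: collect the due jobs while holding the read lock, release it, and only then fire and reschedule them, so the write lock is never requested by a thread that still holds the read lock. The class, the `Job` type, and the in-memory list are hypothetical stand-ins for the real scheduler store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SnapshotThenScheduleSketch {

    // Hypothetical job: a name, a due time, and a repeat period (0 = one-shot).
    static final class Job {
        final String name;
        final long dueAt;
        final long period;

        Job(String name, long dueAt, long period) {
            this.name = name;
            this.dueAt = dueAt;
            this.period = period;
        }
    }

    private final ReentrantReadWriteLock storeLock = new ReentrantReadWriteLock();
    private final List<Job> store = new ArrayList<>(); // stand-in for the persistent store
    final List<String> fired = new ArrayList<>();      // records which jobs were fired

    // Writing to the store requires the write lock.
    void schedule(Job job) {
        storeLock.writeLock().lock();
        try {
            store.add(job);
        } finally {
            storeLock.writeLock().unlock();
        }
    }

    void remove(Job job) {
        storeLock.writeLock().lock();
        try {
            store.remove(job);
        } finally {
            storeLock.writeLock().unlock();
        }
    }

    // One pass of the scheduler loop.
    void runOnce(long now) {
        // Step 1: only read under the read lock, copying the due jobs out.
        List<Job> due = new ArrayList<>();
        storeLock.readLock().lock();
        try {
            for (Job job : store) {
                if (job.dueAt <= now) {
                    due.add(job);
                }
            }
        } finally {
            storeLock.readLock().unlock();
        }

        // Step 2: the read lock is released, so taking the write lock is safe.
        for (Job job : due) {
            fired.add(job.name);
            remove(job);
            if (job.period > 0) {
                schedule(new Job(job.name, job.dueAt + job.period, job.period));
            }
        }
    }

    public static void main(String[] args) {
        SnapshotThenScheduleSketch scheduler = new SnapshotThenScheduleSketch();
        scheduler.schedule(new Job("cron", 10, 5));
        scheduler.schedule(new Job("once", 10, 0));
        scheduler.runOnce(10);
        scheduler.runOnce(15);
        System.out.println(scheduler.fired); // prints [cron, once, cron]
    }
}
```

The trade-off in this sketch is that the store can change between the snapshot and the firing step, so a real implementation would have to tolerate a job being removed in between.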

thezbyg avatar Mar 16 '24 08:03 thezbyg

Please use the existing org.apache.activemq.broker.scheduler.JmsSchedulerTest#testCron test to confirm that the persistent scheduler currently deadlocks on CRON jobs. This is not the same deadlock as the one reported in AMQ-9448, but it has the same cause.

thezbyg avatar Mar 16 '24 08:03 thezbyg

I have now added a new unit test. Does the existing JmsSchedulerTest#testCron unit test run successfully for you before this change?

thezbyg avatar Mar 20 '24 06:03 thezbyg

Do you have a thread dump of the deadlock occurring?

mattrpav avatar May 31 '24 19:05 mattrpav

Yes. This is the stack trace of the "JobScheduler:JMS" thread, which blocks itself: it acquires the read lock on the store while iterating over scheduled jobs in the mainLoop() method and then attempts to acquire the write lock to write the new scheduled job's information:

"JobScheduler:JMS" #28 daemon prio=5 os_prio=0 cpu=11.62ms elapsed=117.73s tid=0x00007f43d1344b50 nid=0x185b waiting on condition [0x00007f4356cfe000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for <0x000000008bd25150> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock([email protected]/ReentrantReadWriteLock.java:959)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl$8.visit(JobSchedulerStoreImpl.java:684)
    at org.apache.activemq.store.kahadb.data.KahaAddScheduledJobCommand.visit(KahaAddScheduledJobCommand.java:283)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl.process(JobSchedulerStoreImpl.java:679)
    at org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:495)
    at org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:403)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.doSchedule(JobSchedulerImpl.java:252)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.schedule(JobSchedulerImpl.java:100)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.mainLoop(JobSchedulerImpl.java:782)
    at org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.run(JobSchedulerImpl.java:699)
    at java.lang.Thread.run([email protected]/Thread.java:833)
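The self-block in the trace above comes down to a documented property of java.util.concurrent.locks.ReentrantReadWriteLock: a thread that holds the read lock cannot upgrade to the write lock. A small standalone illustration follows. The class and method names are made up for this example, and it uses the non-blocking tryLock() so that it reports the failure instead of hanging the way WriteLock.lock() does in the scheduler thread.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadToWriteUpgradeDemo {

    // Tries to take the write lock while the same thread still holds the read lock.
    // Returns true only if the write lock was acquired.
    static boolean tryUpgradeWhileReading() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            boolean acquired = lock.writeLock().tryLock();
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Releases the read lock first, then tries to take the write lock.
    static boolean tryWriteAfterReleasingRead() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        lock.readLock().unlock();
        boolean acquired = lock.writeLock().tryLock();
        if (acquired) {
            lock.writeLock().unlock();
        }
        return acquired;
    }

    public static void main(String[] args) {
        // With a blocking lock() call, the first case would park forever.
        System.out.println("upgrade while reading: " + tryUpgradeWhileReading());        // false
        System.out.println("write after releasing read: " + tryWriteAfterReleasingRead()); // true
    }
}
```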

Full dump: dump.txt

thezbyg avatar Jun 01 '24 06:06 thezbyg