jetty.project icon indicating copy to clipboard operation
jetty.project copied to clipboard

Deadlock ReentrantLock$NonfairSync inbetween two threads

Open r0bertini opened this issue 1 year ago • 8 comments

Jetty version(s) 10.0.17

Jetty Environment N/A

Java version/vendor (use: java -version) java 17.0.9 2023-10-17 LTS

OS type/version Linux 5.4.17-2136.325.5.el7uek.x86_64 x86_64

Description

Found one Java-level deadlock:

"Connector-Scheduler-3e2055d6-1": waiting for ownable synchronizer 0x0000001015400340, (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is held by "qtp1272883899-10442"

"qtp1272883899-10442": waiting for ownable synchronizer 0x00000010047c3cd8, (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is held by "qtp1272883899-10431"

"qtp1272883899-10431": waiting for ownable synchronizer 0x0000001015400340, (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is held by "qtp1272883899-10442"

Java stack information for the threads listed above:

"Connector-Scheduler-3e2055d6-1":
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x0000001015400340> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly([email protected]/AbstractQueuedSynchronizer.java:958)
	at java.util.concurrent.locks.ReentrantLock$Sync.lockInterruptibly([email protected]/ReentrantLock.java:161)
	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly([email protected]/ReentrantLock.java:372)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:1165)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take([email protected]/ScheduledThreadPoolExecutor.java:899)
	at java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1062)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1122)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
	at java.lang.Thread.run([email protected]/Thread.java:842)
"qtp1272883899-10442":
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x00000010047c3cd8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938)
	at java.util.concurrent.locks.ReentrantLock$Sync.lock([email protected]/ReentrantLock.java:153)
	at java.util.concurrent.locks.ReentrantLock.lock([email protected]/ReentrantLock.java:322)
	at org.eclipse.jetty.util.thread.AutoLock.lock(AutoLock.java:44)
	at org.eclipse.jetty.server.Server.getDateField(Server.java:341)
	at org.eclipse.jetty.server.HttpChannel.onRequest(HttpChannel.java:893)
	at org.eclipse.jetty.server.HttpChannelOverHttp.headerComplete(HttpChannelOverHttp.java:344)
	at org.eclipse.jetty.http.HttpParser.parseFields(HttpParser.java:1248)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1533)
	at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:398)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
	at java.lang.Thread.run([email protected]/Thread.java:842)
"qtp1272883899-10431":
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x0000001015400340> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938)
	at java.util.concurrent.locks.ReentrantLock$Sync.lock([email protected]/ReentrantLock.java:153)
	at java.util.concurrent.locks.ReentrantLock.lock([email protected]/ReentrantLock.java:322)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.remove([email protected]/ScheduledThreadPoolExecutor.java:1046)
	at java.util.concurrent.ThreadPoolExecutor.remove([email protected]/ThreadPoolExecutor.java:1761)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.cancel([email protected]/ScheduledThreadPoolExecutor.java:293)
	at org.eclipse.jetty.util.thread.ScheduledExecutorScheduler$ScheduledFutureTask.cancel(ScheduledExecutorScheduler.java:152)
	at org.eclipse.jetty.io.IdleTimeout.deactivate(IdleTimeout.java:147)
	at org.eclipse.jetty.io.IdleTimeout.onClose(IdleTimeout.java:140)
	at org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:350)
	at org.eclipse.jetty.io.SelectableChannelEndPoint.onClose(SelectableChannelEndPoint.java:165)
	at org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:344)
	at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:263)
	at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:227)
	at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:210)
	at org.eclipse.jetty.server.HttpConnection.abort(HttpConnection.java:600)
	at org.eclipse.jetty.server.HttpChannel.abort(HttpChannel.java:1181)
	at org.eclipse.jetty.server.HttpChannelOverHttp.abort(HttpChannelOverHttp.java:80)
	at org.eclipse.jetty.server.HttpChannel.onBadMessage(HttpChannel.java:1021)
	at org.eclipse.jetty.server.HttpChannelOverHttp.badMessage(HttpChannelOverHttp.java:194)
	at org.eclipse.jetty.http.HttpParser.badMessage(HttpParser.java:1658)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1645)
	at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:398)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
	at java.lang.Thread.run([email protected]/Thread.java:842)

Found 1 deadlock.

How to reproduce?

Unknown.

r0bertini avatar Feb 26 '24 09:02 r0bertini

@r0bertini I am trying to analyze your report, but the lines in the stack traces to not match the code for Jetty 10.0.17.

Are you sure of the Jetty version?

Is this reproducible? If so, can you take a jstack -e -l <pid> so that we have more information about the locks?

Your report says that thread qtp1272883899-10442 holds the lock for 0x0000001015400340, but I cannot find any lock held in the stack trace of qtp1272883899-10442.

Are you using an unmodified version of Jetty?

Are you sure you reported the correct stack traces?

sbordet avatar Feb 27 '24 18:02 sbordet

@sbordet unfortunately this is not reproducible on-demand. There are posted 3 threads qtp1272883899-10431, qtp1272883899-10442 and Connector-Scheduler-3e2055d6-1 which share 2 locks inbetween. Regarding version, my bad that is actually 10.0.18

r0bertini avatar Mar 20 '24 17:03 r0bertini

@r0bertini as I said in the previous comment, these stack traces are not enough, as nowhere it is shown who holds 0x0000001015400340.

If you can reproduce the issue, please post the full stack trace.

Also specify the Java vendor you are using.

sbordet avatar Mar 20 '24 20:03 sbordet

These two threads are waiting for synchronizer which is held by the other thread:

"qtp1272883899-10442": waiting for ownable synchronizer 0x00000010047c3cd8, (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is held by "qtp1272883899-10431"

"qtp1272883899-10431": waiting for ownable synchronizer 0x0000001015400340, (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is held by "qtp1272883899-10442"

It all boils down to: [email protected]/AbstractQueuedSynchronizer.java:715 The version is - I checked once again 10.0.17 - standard unmodified Jetty and we use standard Oracle Java 17 - jdk17.x86_64 17.0.10.

r0bertini avatar Mar 27 '24 12:03 r0bertini

@r0bertini Something is very strange here. Firstly your stacks still do not match 10.0.18 entirely. Then the thread 10442 is waiting on a _dateLock on Server. Which is used in one and only one method getDateField, which doesn't call out to other code. The thread 10431 is not withing getDateField, so it can't possible be holding that lock. Neither does the thread 3e2055d6-1

So I cannot see how the real code of 10.0.18 can deadlock on the _dateLock?

Either your code has been modified or there is something very wrong in that JVM?

Can you see in your full stack dump if there are two threads in getDateField?

gregw avatar Mar 27 '24 13:03 gregw

The JVM issue could actually be the culprit here, there were strange behaviour on numerous instances around that time. Initially we thought it was caused by this, that's why I reported that.

I only can see 0x00000010047c3cd8 lock for the getDateField but from hundreds of threads (I possess thread dumps from 4 processors and there are 500+ occurrences of each lock of the two mentioned, always with same stacktraces as initially posted)

For instance: "qtp1272883899-11593" #11593 prio=5 os_prio=0 cpu=0.15ms elapsed=4103.15s tid=0x00007f55cf6cf0b0 nid=0x5c59 waiting on condition [0x00007f5487836000] java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park([email protected]/Native Method) - parking to wait for <0x00000010047c3cd8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938) at java.util.concurrent.locks.ReentrantLock$Sync.lock([email protected]/ReentrantLock.java:153) at java.util.concurrent.locks.ReentrantLock.lock([email protected]/ReentrantLock.java:322) at org.eclipse.jetty.util.thread.AutoLock.lock(AutoLock.java:44) at org.eclipse.jetty.server.Server.getDateField(Server.java:341) Not sure if that is something you can investigate further or if we can safely assume there was an issue with JVM?

r0bertini avatar Mar 27 '24 15:03 r0bertini

There is not much we can investigate further on, simply because it is so bizarre. It certainly looks like you have some thread holding the _dateLock, but then I cannot see how the thread with that lock can actually get stuck in there become blocked. It uses one threadlocal date formatter and creates two objects. One of those objects may call out to HttpField pre-encoders, but I've walked through that code and can see no locks. What protocols are you using HTTP1, 2? 3?

gregw avatar Mar 27 '24 16:03 gregw

We use HTTP1, OK let's leave this one for now, if there is another occurrence the team will have instructions to take even heapdumps so we can investigate deeper, thanks you both for taking look at this!

r0bertini avatar Mar 28 '24 13:03 r0bertini