
[Bug]: AMS encounters a deadlock

Open baiyangtx opened this issue 2 years ago • 0 comments

What happened?

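A deadlock was found in AMS. Two thrift-server-OptimizeManager threads, both handling completeTask calls, each wait on a ReentrantLock held by the other (see the thread dump under "Relevant log output").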


Affects Versions

master

What engines are you seeing the problem on?

No response

How to reproduce

No response

Relevant log output

Found one Java-level deadlock:
=============================
"thrift-server-OptimizeManager-1":
  waiting for ownable synchronizer 0x00000005c3813190, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "thrift-server-OptimizeManager-0"
"thrift-server-OptimizeManager-0":
  waiting for ownable synchronizer 0x00000005c381a628, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "thrift-server-OptimizeManager-1"

Java stack information for the threads listed above:
===================================================
"thrift-server-OptimizeManager-1":
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000005c3813190> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess.acceptResult(OptimizingQueue.java:487)
	at com.netease.arctic.server.optimizing.TaskRuntime.lambda$complete$0(TaskRuntime.java:82)
	at com.netease.arctic.server.optimizing.TaskRuntime$$Lambda$1037/123588458.run(Unknown Source)
	at com.netease.arctic.server.persistence.StatedPersistentBase.invokeConsisitency(StatedPersistentBase.java:32)
	at com.netease.arctic.server.optimizing.TaskRuntime.complete(TaskRuntime.java:74)
	at com.netease.arctic.server.optimizing.OptimizingQueue.completeTask(OptimizingQueue.java:252)
	at com.netease.arctic.server.DefaultOptimizingService.completeTask(DefaultOptimizingService.java:167)
	at sun.reflect.GeneratedMethodAccessor177.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.netease.arctic.server.utils.ThriftServiceProxy.invoke(ThriftServiceProxy.java:56)
	at com.sun.proxy.$Proxy45.completeTask(Unknown Source)
	at com.netease.arctic.ams.api.OptimizingService$Processor$completeTask.getResult(OptimizingService.java:583)
	at com.netease.arctic.ams.api.OptimizingService$Processor$completeTask.getResult(OptimizingService.java:1)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
	at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:138)
	at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
	at org.apache.thrift.server.Invocation.run(Invocation.java:18)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
"thrift-server-OptimizeManager-0":
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000005c381a628> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at com.netease.arctic.server.persistence.StatedPersistentBase.invokeConsisitency(StatedPersistentBase.java:29)
	at com.netease.arctic.server.optimizing.TaskRuntime.tryCanceling(TaskRuntime.java:147)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess$$Lambda$1182/1070112395.accept(Unknown Source)
	at java.util.HashMap$Values.forEach(HashMap.java:982)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess.lambda$persistProcessCompleted$10(OptimizingQueue.java:682)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess$$Lambda$1179/1723082963.run(Unknown Source)
	at com.netease.arctic.server.persistence.PersistentBase.doAsTransaction(PersistentBase.java:61)
	at com.netease.arctic.server.optimizing.OptimizingQueue.access$700(OptimizingQueue.java:74)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess.persistProcessCompleted(OptimizingQueue.java:681)
	at com.netease.arctic.server.optimizing.OptimizingQueue$TableOptimizingProcess.acceptResult(OptimizingQueue.java:516)
	at com.netease.arctic.server.optimizing.TaskRuntime.lambda$complete$0(TaskRuntime.java:82)
	at com.netease.arctic.server.optimizing.TaskRuntime$$Lambda$1037/123588458.run(Unknown Source)
	at com.netease.arctic.server.persistence.StatedPersistentBase.invokeConsisitency(StatedPersistentBase.java:32)
	at com.netease.arctic.server.optimizing.TaskRuntime.complete(TaskRuntime.java:74)
	at com.netease.arctic.server.optimizing.OptimizingQueue.completeTask(OptimizingQueue.java:252)
	at com.netease.arctic.server.DefaultOptimizingService.completeTask(DefaultOptimizingService.java:167)
	at sun.reflect.GeneratedMethodAccessor177.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.netease.arctic.server.utils.ThriftServiceProxy.invoke(ThriftServiceProxy.java:56)
	at com.sun.proxy.$Proxy45.completeTask(Unknown Source)
	at com.netease.arctic.ams.api.OptimizingService$Processor$completeTask.getResult(OptimizingService.java:583)
	at com.netease.arctic.ams.api.OptimizingService$Processor$completeTask.getResult(OptimizingService.java:1)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
	at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:138)
	at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
	at org.apache.thrift.server.Invocation.run(Invocation.java:18)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Found 1 deadlock.
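
Reading the two stacks together, the dump appears to show a lock-order inversion. "thrift-server-OptimizeManager-0" is already inside acceptResult (OptimizingQueue.java:516) and blocks in invokeConsisitency (StatedPersistentBase.java:29) while calling tryCanceling on other tasks. "thrift-server-OptimizeManager-1" is already inside invokeConsisitency (StatedPersistentBase.java:32) for its own task and blocks in acceptResult (OptimizingQueue.java:487). The following is a minimal sketch of that pattern, not AMS code: the class name LockOrderSketch and the two locks processLock and taskLock are hypothetical stand-ins, and the mapping of the dump's lock addresses to a process-level lock and a task-level lock is an assumption drawn from the frames, not something the dump states directly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: two threads take the same two locks in opposite order.
class LockOrderSketch {

    // Acquire `first`, wait until both threads hold their first lock, then try `second`.
    static void lockInOrder(ReentrantLock first, ReentrantLock second, CountDownLatch ready) {
        try {
            first.lockInterruptibly();
            try {
                ready.countDown();
                ready.await();              // both threads now hold their first lock
                second.lockInterruptibly(); // parks until interrupted: the other thread owns it
                second.unlock();
            } finally {
                first.unlock();
            }
        } catch (InterruptedException e) {
            // Interrupted by the caller to break the deadlock; nothing else to clean up.
        }
    }

    // Starts the two threads and returns how many threads the JVM reports as deadlocked.
    static int deadlockedThreadCount() {
        ReentrantLock processLock = new ReentrantLock(); // assumed: lock taken in acceptResult
        ReentrantLock taskLock = new ReentrantLock();    // assumed: lock taken in invokeConsisitency
        CountDownLatch ready = new CountDownLatch(2);

        // sketch-0 mirrors OptimizeManager-0: process lock first, then another task's lock.
        Thread t0 = new Thread(() -> lockInOrder(processLock, taskLock, ready), "sketch-0");
        // sketch-1 mirrors OptimizeManager-1: its task lock first, then the process lock.
        Thread t1 = new Thread(() -> lockInOrder(taskLock, processLock, ready), "sketch-1");
        t0.setDaemon(true);
        t1.setDaemon(true);
        t0.start();
        t1.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        try {
            // Poll for up to about five seconds until both threads are parked.
            for (int i = 0; i < 100 && ids == null; i++) {
                Thread.sleep(50);
                ids = threads.findDeadlockedThreads(); // null when no deadlock is found
            }
            // Break the deadlock so the sketch terminates cleanly.
            t0.interrupt();
            t1.interrupt();
            t0.join(1000);
            t1.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

If this reading is right, the usual remedies are to take the two locks in one fixed order on every path, or to move the tryCanceling calls outside the region that holds the process-level lock. Which of these fits AMS is for the maintainers to decide.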

Anything else

No response

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

baiyangtx, Nov 30 '23 06:11