motan
Incorrect use of lock.wait in com.weibo.api.motan.rpc.DefaultResponseFuture#getValue
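    synchronized (lock) {
        while (!conditionPredicate())
            lock.wait();
        // the object is now in the desired state
    }

Test the condition predicate before calling wait, and test it again on returning from wait. Always call wait in a loop. (Java Concurrency in Practice)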
I don't see anything wrong with how lock.wait is used here.
Without a timeout, the wait ends on notify or on an interrupt; with a timeout, it ends on notify or when the wait time elapses.
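lock.wait must be used inside a while (check) loop; see the comment on the Object#wait method. Without the loop check, the biggest problem is that lock.wait can wake up spuriously, that is, the thread may wake up without having received notify/notifyAll. See https://en.wikipedia.org/wiki/Spurious_wakeup#Spurious_wakeup_in_Linux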
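A thread being woken up unexpectedly is one of the expected behaviors, and the code already handles it.
When no timeout is set, this is not an explicit condition variable. The wait is inherently designed around uncertainty, so looping to check a termination condition is not appropriate here. Not every wait has to be used in a loop; it depends on the actual usage scenario.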
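What exactly is the handling logic after an unexpected wakeup? Also, could you elaborate on "not every wait has to be used in a loop; it depends on the actual usage scenario"? Thanks.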
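For requests without a timeout, it is enough to guarantee that an unexpected wakeup triggers the cancel operation.
As for whether wait must be in a loop, that is a matter of opinion. I don't think it should be discussed in an issue as if it were a bug; a Java-focused community would probably be a better place for it.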
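https://github.com/weibocom/motan/blob/ed0095dd65660f4464dff987202d462710c3972f/motan-core/src/main/java/com/weibo/api/motan/rpc/DefaultResponseFuture.java#L84-L93
An unexpected wakeup does not trigger the cancel operation, because a spurious wakeup does not throw an exception.
As for whether wait must be in a loop, the comment on Object#wait states explicitly that it must be used in a loop: https://github.com/ZenOfAutumn/jdk8/blob/de6c37469e54d46841838423400144f7b9dc4cf1/java/lang/Object.java#L477-L485
This is core code that runs on every RPC request, so no amount of care about its correctness is excessive.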
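To make the point concrete, here is a minimal sketch of a loop-based wait for the no-timeout case. It is not motan's code: the class GuardedFuture and its methods getValue and onSuccess are hypothetical names chosen for illustration. The assumption is that completion is tracked by a flag guarded by the same lock, so a spurious wakeup simply re-checks the flag, and only a real interrupt leads to cancellation.

```java
import java.util.concurrent.CancellationException;

// Hypothetical stand-in for a response future; not motan's actual class.
class GuardedFuture {
    private final Object lock = new Object();
    private boolean done = false;      // condition predicate, guarded by lock
    private boolean cancelled = false; // guarded by lock
    private Object value;              // guarded by lock

    /** Blocks until a result arrives or the waiting thread is interrupted. */
    public Object getValue() {
        synchronized (lock) {
            // A spurious wakeup falls through to the loop check and waits again.
            while (!done) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    // Only a real interrupt cancels the request.
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    cancelled = true;
                    done = true;
                    lock.notifyAll();
                }
            }
            if (cancelled) {
                throw new CancellationException("interrupted while waiting");
            }
            return value;
        }
    }

    /** Called by the I/O thread when the response arrives. */
    public void onSuccess(Object result) {
        synchronized (lock) {
            if (done) {
                return; // already completed or cancelled
            }
            value = result;
            done = true;
            lock.notifyAll();
        }
    }
}
```

With this shape, a wakeup without notify or interrupt is harmless: the thread sees done is still false and goes back to waiting.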
What are spurious wakeups in Java threads? A thread in the WAIT state on an object wakes up for no reason: it is neither notified, timed out, nor interrupted. It is possible for a thread to wake up even if notify() and notifyAll() have not been called. This behavior is known as a spurious wakeup, a wakeup without any reason.
Spurious wakeup describes a complication in the use of condition variables as provided by certain multithreading APIs such as POSIX Threads and the Windows API. Even after a condition variable appears to have been signaled from a waiting thread's point of view, the condition that was awaited may still be false. Simply put, a thread can wake up from its waiting state without being signaled, interrupted, or timing out. To be correct, the awakened thread has to verify the condition that should have caused it to be awakened, and it must continue waiting if the condition is not satisfied.
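Just passing by and saw this issue. I have not used motan, but I think both of you make good points. On the design of the timeout mechanism: although it is possible not to set a timeout, in practice one is almost always set, so offering a "no timeout" option has little value. (This is my own understanding, with reference to hazelcast's timeout handling: "we should only wait if there is any timeout. We can't call wait with 0, because it is interpreted as infinite.")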
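For the timed case, a rough sketch of a wait loop that recomputes the remaining time after every wakeup and never calls wait(0) might look like the following. The class TimedGate and its methods are hypothetical and come from neither motan nor hazelcast; the sketch only assumes a deadline taken from System.nanoTime().

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names are illustrative, not from motan or hazelcast.
class TimedGate {
    private final Object lock = new Object();
    private boolean opened = false; // condition predicate, guarded by lock

    /** Returns true if the gate opened within timeoutMillis, false on timeout. */
    public boolean await(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (!opened) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // the deadline has really passed
                }
                // Never pass 0: Object.wait(0) means "wait with no timeout".
                long remainingMillis =
                        Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                lock.wait(remainingMillis);
            }
            return true;
        }
    }

    /** Opens the gate and wakes every waiting thread. */
    public void open() {
        synchronized (lock) {
            opened = true;
            lock.notifyAll();
        }
    }
}
```

Because the deadline is fixed up front, a spurious wakeup neither extends the total wait nor gets reported as a timeout.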
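I tend to agree with @mrrao that a spurious wakeup does not trigger the cancel operation. To add to that, I found some stronger evidence. OperationTimeoutException: hazelcast hit incorrect timeout exceptions because of a similar problem, and it took them months of investigation to find the cause. Judging from their analysis, it is hard to reproduce but still shows up from time to time.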
The JDK also has similar examples: interrupt of wait()ing thread isn't triggering InterruptedException
JDK-7011442 : AppletClassLoader.java needs to avoid spurious wakeup
    synchronized (creatorThread.syncObject) {
        while (!creatorThread.created) {
            creatorThread.syncObject.wait();
        }
    }
JDK-7011443 : ./share/classes/sun/awt/SunToolkit.java needs to avoid spurious wakeup
    synchronized (lock) {
        executeOnEventHandlerThread(event);
        while (!event.isDispatched()) {
            lock.wait();
        }
    }
The above is only my personal opinion and may overgeneralize in places; corrections are welcome.