Sentinel
Fix concurrent error in sentinel cluster client
Describe what this PR does / why we need it
In `TokenClientPromiseHolder.completePromise(int xid, ClusterResponse<T> response)`, we get the promise from `PROMISE_MAP`, and the promise may be null. There are two possible situations:
- The response has timed out and `NettyTransportClient` removed the promise via `TokenClientPromiseHolder.remove(xid)` (OK).
- The response was received but `NettyTransportClient` hasn't invoked `putPromise(xid, promise)` yet (concurrent error, mostly when a GC pause happens).

This PR fixes the second case so the client does not get a timeout exception in `promise.await`.
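To make the race concrete, here is a minimal, self-contained sketch (hypothetical names; `CompletableFuture` stands in for Netty's `ChannelPromise`, and this is not the actual Sentinel source). The response handler looks up the promise before the sender has registered it, so the completion is lost and the client later times out:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PromiseRaceDemo {
    // Stand-in for Sentinel's PROMISE_MAP.
    static final ConcurrentHashMap<Integer, CompletableFuture<String>> PROMISE_MAP =
            new ConcurrentHashMap<>();

    // Stand-in for TokenClientPromiseHolder.completePromise(xid, response).
    static void completePromise(int xid, String response) {
        CompletableFuture<String> promise = PROMISE_MAP.get(xid);
        if (promise == null) {
            // The race window: the response arrived before putPromise() ran, so it is dropped.
            System.out.println("response for xid=" + xid + " dropped (no promise registered yet)");
            return;
        }
        promise.complete(response);
    }

    public static void main(String[] args) throws Exception {
        int xid = 1;

        // Simulate writeAndFlush(): the "server" responds on another thread, and here
        // the response handler runs before the client registers its promise.
        Thread responder = new Thread(() -> completePromise(xid, "ok"));
        responder.start();
        responder.join();

        // putPromise() happens too late: the completion above was already dropped.
        CompletableFuture<String> promise = new CompletableFuture<>();
        PROMISE_MAP.put(xid, promise);

        try {
            promise.get(200, TimeUnit.MILLISECONDS); // the client's promise.await(...)
        } catch (TimeoutException e) {
            System.out.println("client timed out waiting for xid=" + xid);
        }
    }
}
```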
Does this pull request fix one issue?
Fixes #2852
Describe how you did it
- `TokenPromise` uses a `SynchronousQueue` to avoid the concurrent error (like a `chan` in Go); see the sketch after this list.
- `TokenPromise` uses `startTime` and `poll` to avoid blocking the client for too long.
- `TokenClientPromiseHolder` uses a thread pool to avoid `TokenClientHandler` blocking too long in `channelRead`.
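A minimal sketch of the `SynchronousQueue` idea, with hypothetical names and simplified signatures (not the PR's actual code): `offer` and `poll` rendezvous like an unbuffered Go `chan`, so a response handed over before the waiter arrives is not lost, and the deadline is measured from `startTime` so a late client does not stretch the overall wait. Because `offer` can block, completion would run on a thread pool rather than inside `channelRead`, which is the third bullet above.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

public class TokenPromise<T> {
    private final SynchronousQueue<T> queue = new SynchronousQueue<>();
    private final long startTime = System.currentTimeMillis();

    // Called by the response handler (on a worker thread, not in channelRead):
    // waits up to handoffTimeoutMs for the waiting client to rendezvous.
    public boolean complete(T response, long handoffTimeoutMs) throws InterruptedException {
        return queue.offer(response, handoffTimeoutMs, TimeUnit.MILLISECONDS);
    }

    // Called by the client thread. The deadline is measured from startTime, so a
    // client that arrives late does not extend the overall wait.
    public T await(long timeoutMs) throws InterruptedException {
        long remaining = startTime + timeoutMs - System.currentTimeMillis();
        if (remaining <= 0) {
            return null; // already past the deadline
        }
        return queue.poll(remaining, TimeUnit.MILLISECONDS);
    }
}
```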
Describe how to verify it
In `TokenPromiseTest`, there are three unit test cases (a sketch of the fast-consumer case follows this list):
- `normalTest`: RT is 3 ms.
- `FastConsumerTest`: RT is 0 ms; the response is received before the promise is put.
- `timeoutTest`: RT is 20 ms.
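As a rough illustration of the fast-consumer case (hypothetical JUnit 4 code reusing the `TokenPromise` sketch above, not the PR's actual tests): the responder completes the promise before the client starts waiting, and the rendezvous still hands the value over.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class TokenPromiseSketchTest {

    @Test
    public void fastConsumerSketch() throws Exception {
        TokenPromise<String> promise = new TokenPromise<>();

        // RT is 0 ms: the responder hands the response over before the client polls,
        // and the offer simply waits for the client to arrive.
        Thread responder = new Thread(() -> {
            try {
                promise.complete("ok", 1000);
            } catch (InterruptedException ignored) {
            }
        });
        responder.start();

        // The client arrives "late" but still receives the response.
        assertEquals("ok", promise.await(500));
        responder.join();
    }
}
```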
Could you please sign the CLA: https://cla-assistant.io/alibaba/Sentinel?pullRequest=2861
Could you please illustrate your design in PR description?
done
Hi @XHao, if the concurrent issue is caused by `channel.writeAndFlush(request);` running before `TokenClientPromiseHolder.putPromise(xid, promise);`, why can't we solve it by moving `TokenClientPromiseHolder.putPromise(xid, promise);` ahead?
Sorry, I made a stupid mistake. 😅
`channel.writeAndFlush()` has nothing to do with `channel.newPromise()`, so moving `TokenClientPromiseHolder.put(xid, channel.newPromise());` ahead solves this problem.
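For illustration, a simplified sketch of that reordering, assuming a Netty `Channel`; `SendSketch` and its parameters are hypothetical, while the holder calls follow the names used in this thread (import for the holder omitted):

```java
import io.netty.channel.Channel;
import io.netty.channel.ChannelPromise;

final class SendSketch {
    // newPromise() only allocates a promise object and performs no I/O, so the
    // promise can be registered before the request is flushed; any response that
    // arrives will then find it already in the map.
    static ChannelPromise send(Channel channel, int xid, Object request) {
        ChannelPromise promise = channel.newPromise();
        TokenClientPromiseHolder.putPromise(xid, promise); // register first
        channel.writeAndFlush(request);                    // then send
        return promise;
    }
}
```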
This PR is too complicated, and I will close it.😓