Error count optimization
Current behavior
com.weibo.api.motan.transport.netty4.NettyClient
- Every successful invoke calls resetErrorCount() and writes the volatile errorCount, which is unnecessary, and the method contains this branch:
    if (state.isUnAliveState()) {
        long count = errorCount.longValue();
        if (count < maxClientConnection) {
            // Should not reach here <<---
            state = ChannelState.ALIVE;
        }
    }
- A small concurrency problem when errorCount=9 and maxClientConnection=10:
    Thread1 : errorCount.incrementAndGet() -> errorCount(9->10), then slowed down by GC or sync
    Thread2 : errorCount.incrementAndGet() -> errorCount(10->11), then slowed down by GC or sync
    Thread3 : errorCount.incrementAndGet() -> errorCount(11->12), then slowed down by GC or sync
    Thread4 : errorCount.incrementAndGet() -> errorCount(12->13), then slowed down by GC or sync
    Thread4 : errorCount.set(0) -> errorCount(13->0)
  Then nothing happens: the client is not marked unalive and has to wait for another maxClientConnection errors.
Optimization
1. Use get() combined with accumulateAndGet(). Set state=ChannelState.ALIVE after a successful reconnect, not here.
    private LongBinaryOperator resetErrorCntOp = (prev, zero) -> prev < maxClientConnection ? zero : prev;

    void resetErrorCount() {
        if (errorCount.get() != 0L && state.isAliveState()) {
            errorCount.accumulateAndGet(0L, resetErrorCntOp);
        }
    }
2. Use incrementAndGet() == maxClientConnection as the trigger
    void incrErrorCount() {
        if (errorCount.incrementAndGet() == maxClientConnection && state.isAliveState()) {
            LoggerUtil.error("NettyClient unavailable Error: url=" + url.getIdentity() + " "
                    + url.getServerPortStr());
            state = ChannelState.UNALIVE;
        }
    }
That looks clearer and a little more efficient.
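To make the two proposed methods easier to try out in isolation, here is a minimal standalone sketch. It is not Motan code: the class ErrorCountSketch, the State enum, the transitions counter and the onReconnected() hook are all made up for illustration, and only the bodies of resetErrorCount() and incrErrorCount() follow the snippets above.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongBinaryOperator;

// Hypothetical stand-in for the error-count part of NettyClient; not Motan code.
class ErrorCountSketch {
    enum State { ALIVE, UNALIVE }

    private final long maxClientConnection;
    private final AtomicLong errorCount = new AtomicLong();
    private volatile State state = State.ALIVE;
    // Counts how often the ALIVE -> UNALIVE transition fired (demo only).
    private final AtomicLong transitions = new AtomicLong();
    // Resets to zero only while the count is still below the threshold.
    private final LongBinaryOperator resetErrorCntOp;

    ErrorCountSketch(long maxClientConnection) {
        this.maxClientConnection = maxClientConnection;
        this.resetErrorCntOp = (prev, zero) -> prev < maxClientConnection ? zero : prev;
    }

    void resetErrorCount() {
        // Read first, so the common success path (count already 0) does no write.
        if (errorCount.get() != 0L && state == State.ALIVE) {
            errorCount.accumulateAndGet(0L, resetErrorCntOp);
        }
    }

    void incrErrorCount() {
        // Only the caller whose increment lands exactly on the threshold enters here.
        if (errorCount.incrementAndGet() == maxClientConnection && state == State.ALIVE) {
            state = State.UNALIVE;
            transitions.incrementAndGet();
        }
    }

    // Assumed hook: invoked by whatever code performs a successful reconnect.
    void onReconnected() {
        errorCount.set(0L);
        state = State.ALIVE;
    }

    boolean isAlive() { return state == State.ALIVE; }
    long errorCount() { return errorCount.get(); }
    long transitions() { return transitions.get(); }

    public static void main(String[] args) {
        ErrorCountSketch client = new ErrorCountSketch(10);
        for (int i = 0; i < 14; i++) {
            client.incrErrorCount();
        }
        System.out.println("alive=" + client.isAlive()
                + " errorCount=" + client.errorCount()
                + " transitions=" + client.transitions());
    }
}
```

Because the reset operator leaves the count unchanged once it has reached maxClientConnection, a late reset from a successful invoke can no longer wipe out a count that has already crossed the threshold.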
Motan version
1.1.6
JVM version (e.g. java -version)
java version "1.8.0_131"
@sunnights PTAL
Current behavior
- If a client has three channels and one channel is unalive while the other two are still alive (because of LB or something else), the client may always stay alive and still fail to send requests.
Optimization
- Should we keep a map (channel -> channel's errorCnt) in the client to track every channel? If a channel's errorCnt exceeds the limit, we try to reconnect in another thread. The alive state of the client could then be based on the count or percentage of alive channels.
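A rough sketch of what such a per-channel map could look like, to make the idea concrete. Everything here is an assumption for discussion: the class ChannelHealthSketch, the String channel ids, the minAliveRatio parameter and the method names do not exist in Motan, and the actual reconnect scheduling is left to the caller.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-channel error tracking; channel ids are plain strings here.
class ChannelHealthSketch {
    private final long maxErrorsPerChannel;
    private final double minAliveRatio; // assumed policy knob, e.g. 0.5
    private final Map<String, AtomicLong> errorCounts = new ConcurrentHashMap<>();
    private final Set<String> unaliveChannels = ConcurrentHashMap.newKeySet();

    ChannelHealthSketch(long maxErrorsPerChannel, double minAliveRatio) {
        this.maxErrorsPerChannel = maxErrorsPerChannel;
        this.minAliveRatio = minAliveRatio;
    }

    void register(String channelId) {
        errorCounts.putIfAbsent(channelId, new AtomicLong());
    }

    // Returns true when this error moves the channel onto the threshold,
    // so the caller can schedule a reconnect on another thread.
    boolean recordError(String channelId) {
        AtomicLong count = errorCounts.computeIfAbsent(channelId, id -> new AtomicLong());
        if (count.incrementAndGet() == maxErrorsPerChannel) {
            unaliveChannels.add(channelId);
            return true;
        }
        return false;
    }

    // Resets the count only for channels that have not crossed the threshold.
    void recordSuccess(String channelId) {
        AtomicLong count = errorCounts.get(channelId);
        if (count != null && count.get() != 0L && !unaliveChannels.contains(channelId)) {
            count.accumulateAndGet(0L, (prev, zero) -> prev < maxErrorsPerChannel ? zero : prev);
        }
    }

    // Called after the reconnect for this channel succeeded.
    void markReconnected(String channelId) {
        AtomicLong count = errorCounts.get(channelId);
        if (count != null) {
            count.set(0L);
        }
        unaliveChannels.remove(channelId);
    }

    // The client counts as alive while enough of its channels are alive.
    boolean isClientAlive() {
        int total = errorCounts.size();
        if (total == 0) {
            return false;
        }
        int alive = total - unaliveChannels.size();
        return (double) alive / total >= minAliveRatio;
    }
}
```

With three channels and minAliveRatio set to 0.5, one unalive channel leaves the client alive (2 of 3), while two unalive channels mark it as not alive (1 of 3). Whether a ratio or an absolute count is the better policy is open.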