[ISSUE #5009] Refactor scan broker active.

Open echooymxq opened this issue 3 years ago • 5 comments

Make sure to set the target branch to develop

What is the purpose of the change

  • The other APIs use BrokerHeartbeatManager only to send heartbeats and check whether a broker is active. In the current implementation, the broker registers with the controller leader and with that leader's heartbeat manager; if the controller leader goes offline, the other controller nodes do not have the broker's heartbeat data, so scanNotActiveBroker becomes meaningless. Change points: remove the heartbeat register API and register the broker on its first heartbeat, so that every controller node has the broker's heartbeat data.
  • Create a dedicated BrokerControllerManager to scan broker health and switch master and slave. Redefine the scanNotActiveBroker logic: instead of scanning the heartbeat of every broker instance, it checks only whether the designated master broker is active. To do this, I expose an API on ReplicasInfoManager to get the replicaInfoTable and loop over it.
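
For illustration only, the two change points above could be sketched roughly as follows. This is not the code in this PR: the class HeartbeatSketch, its methods onBrokerHeartbeat, isBrokerActive and scanInactiveMasters, and the shape of the replica table (broker name mapped to the current master address) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; these are not RocketMQ's actual classes or signatures.
class HeartbeatSketch {
    private final long timeoutMillis;
    // broker address -> timestamp (ms) of the last heartbeat received
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    HeartbeatSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // No separate register call: the first heartbeat creates the entry,
    // so any controller node that receives heartbeats can track the broker.
    void onBrokerHeartbeat(String brokerAddr, long nowMillis) {
        lastHeartbeat.put(brokerAddr, nowMillis);
    }

    // A broker is active if it has sent a heartbeat within the timeout.
    boolean isBrokerActive(String brokerAddr, long nowMillis) {
        Long last = lastHeartbeat.get(brokerAddr);
        return last != null && nowMillis - last <= timeoutMillis;
    }

    // Check only the master of each broker set instead of every instance.
    // replicaInfoTable (assumed shape): broker name -> current master address.
    // Returns the broker names whose master looks inactive.
    List<String> scanInactiveMasters(Map<String, String> replicaInfoTable, long nowMillis) {
        List<String> needElection = new ArrayList<>();
        for (Map.Entry<String, String> entry : replicaInfoTable.entrySet()) {
            if (!isBrokerActive(entry.getValue(), nowMillis)) {
                needElection.add(entry.getKey());
            }
        }
        return needElection;
    }
}
```

In this sketch the caller would start a master election for each broker name that scanInactiveMasters returns; slave brokers never enter the scan loop.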

Brief changelog

XX

Verifying this change

XXXX

Follow this checklist to help us incorporate your contribution quickly and easily. Note: it would be helpful if you could finish the following 5 checklist items (the last one is optional) before requesting the community to review your PR.

  • [x] Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • [x] Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • [x] Write the necessary unit tests (over 80% coverage) to verify your logic; prefer mocks where cross-module dependencies exist. If a new feature or significant change is committed, please remember to add an integration test in the test module.
  • [x] Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit-test pass. Run mvn clean test-compile failsafe:integration-test to make sure integration-test pass.
  • [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.

echooymxq avatar Sep 13 '22 02:09 echooymxq

Codecov Report

Merging #5057 (4a5c1bb) into develop (aa7a442) will increase coverage by 0.00%. The diff coverage is 53.65%.

@@            Coverage Diff             @@
##             develop    #5057   +/-   ##
==========================================
  Coverage      42.35%   42.35%           
- Complexity      8192     8194    +2     
==========================================
  Files           1060     1060           
  Lines          73108    73147   +39     
  Branches        9586     9590    +4     
==========================================
+ Hits           30962    30980   +18     
- Misses         38234    38255   +21     
  Partials        3912     3912           
Impacted Files Coverage Δ
...org/apache/rocketmq/broker/out/BrokerOuterAPI.java 17.39% <0.00%> (-0.04%) :arrow_down:
...a/org/apache/rocketmq/common/ControllerConfig.java 0.00% <0.00%> (ø)
...l/header/namesrv/BrokerHeartbeatRequestHeader.java 0.00% <0.00%> (ø)
...controller/impl/DefaultBrokerHeartbeatManager.java 70.94% <65.62%> (-6.97%) :arrow_down:
...ntroller/processor/ControllerRequestProcessor.java 24.74% <100.00%> (ø)
...e/rocketmq/store/ha/autoswitch/EpochFileCache.java 77.08% <0.00%> (-4.17%) :arrow_down:
...impl/consumer/ConsumeMessagePopOrderlyService.java 10.00% <0.00%> (-2.23%) :arrow_down:
...nt/impl/consumer/ConsumeMessageOrderlyService.java 44.91% <0.00%> (-1.76%) :arrow_down:
...mq/client/impl/consumer/RebalanceLitePullImpl.java 69.86% <0.00%> (-1.37%) :arrow_down:
... and 18 more

codecov-commenter avatar Sep 13 '22 04:09 codecov-commenter

@RongtongJin @hzh0425 please help review it.

echooymxq avatar Sep 13 '22 04:09 echooymxq

#5009 @ni-ze

echooymxq avatar Sep 15 '22 07:09 echooymxq

IMO, if the channel close event is lost, the master still cannot be elected when the method controller.electMaster fails.

ni-ze avatar Sep 21 '22 08:09 ni-ze

IMO, if the channel close event is lost, the master still cannot be elected when the method controller.electMaster fails.

You can't assume that electing the master always fails; we rely on scanNotActiveBroker, which will eventually succeed.

echooymxq avatar Sep 21 '22 09:09 echooymxq
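
The point debated above, that a failed election is retried by the next periodic scan, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: ScanRetrySketch, scanOnce, the masterActive map, and the use of a Predicate to stand in for controller.electMaster are all hypothetical.

```java
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; not RocketMQ's actual scan or election code.
class ScanRetrySketch {
    // One scan pass over the broker sets.
    // masterActive (assumed shape): broker name -> whether its master is active.
    // electMaster stands in for the election call and returns true on success.
    // Returns how many broker sets still lack an active master after this pass.
    static int scanOnce(Map<String, Boolean> masterActive, Predicate<String> electMaster) {
        int pending = 0;
        for (Map.Entry<String, Boolean> entry : masterActive.entrySet()) {
            if (entry.getValue()) {
                continue; // master is healthy, nothing to do
            }
            if (electMaster.test(entry.getKey())) {
                entry.setValue(true); // election succeeded on this pass
            } else {
                pending++; // election failed; the next pass tries again
            }
        }
        return pending;
    }
}
```

Because the scan keys off the master's recorded state rather than a one-off channel close event, a broker set whose election failed stays pending and is retried on every later pass until an election succeeds.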

Hi @echooymxq, sorry for not reviewing this PR in time, could you continue to optimize it?

RongtongJin avatar Dec 12 '22 06:12 RongtongJin

Hi @echooymxq, sorry for not reviewing this PR in time, could you continue to optimize it?

yeah, I'll fix it according to your suggestion later.

echooymxq avatar Dec 12 '22 06:12 echooymxq