[ISSUE #4463] fix broker is normal exit, but indexFile not flush disk

Open liuzongliang0202 opened this issue 2 years ago • 12 comments

Make sure to set the target branch to develop

What is the purpose of the change

Currently, a RocketMQ IndexFile is flushed to disk only once it is full. If the IndexFile is not full, its data may be lost even after a normal broker exit. For example, if power fails shortly after RocketMQ exits normally, the page cache may not have been written back to disk in time.
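
To illustrate the concern, here is a minimal sketch, not the actual change in this PR: a memory-mapped file that is only partly filled is explicitly forced to disk during shutdown instead of waiting until it is full. The class `PartialIndexFlushSketch` and its methods are hypothetical names, not RocketMQ APIs; only `FileChannel.map` and `MappedByteBuffer.force` are standard Java calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for an index file; this is not RocketMQ's IndexFile class. */
class PartialIndexFlushSketch {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    PartialIndexFlushSketch(Path path, int capacityBytes) throws IOException {
        this.channel = FileChannel.open(path,
            StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
        // Mapping in READ_WRITE mode grows the file to capacityBytes if needed.
        this.buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacityBytes);
    }

    /** Writes one entry into the mapping; at this point it may live only in the page cache. */
    void putEntry(long value) {
        buffer.putLong(value);
    }

    /** On a normal exit, force the mapped region to disk even though the file is not full. */
    void shutdown() throws IOException {
        buffer.force();
        channel.close();
    }

    /**
     * Demo helper: writes a single entry, shuts down, and reads the entry back.
     * The read-back only checks the written content; it cannot prove durability,
     * because a read may also be served from the page cache.
     */
    static long writeOneEntryAndReadBack(long value) {
        try {
            Path path = Files.createTempFile("index-sketch", ".idx");
            path.toFile().deleteOnExit();
            PartialIndexFlushSketch index = new PartialIndexFlushSketch(path, 1024);
            index.putEntry(value);
            index.shutdown();
            try (FileChannel reader = FileChannel.open(path, StandardOpenOption.READ)) {
                ByteBuffer firstEntry = ByteBuffer.allocate(Long.BYTES);
                reader.read(firstEntry);
                firstEntry.flip();
                return firstEntry.getLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the `force()` call in `shutdown()`, the written entry would depend on the operating system writing back the page cache at some later time, which is the window described above.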

Brief changelog

Refer to https://github.com/apache/rocketmq/issues/4463

Verifying this change

XXXX

Follow this checklist to help us incorporate your contribution quickly and easily. Note that it would be helpful if you could complete the following five checklist items (the last one is optional) before requesting that the community review your PR.

  • [x] Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • [x] Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • [x] Write the necessary unit tests (over 80% coverage) to verify your logic; mocking is preferable where cross-module dependencies exist. If a new feature or significant change is committed, please remember to add an integration test in the test module.
  • [x] Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure the basic checks pass. Run mvn clean install -DskipITs to make sure the unit tests pass. Run mvn clean test-compile failsafe:integration-test to make sure the integration tests pass.
  • [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.

liuzongliang0202 avatar Jun 27 '22 01:06 liuzongliang0202

Codecov Report

Merging #4518 (72927bc) into develop (0bc3b99) will increase coverage by 0.02%. The diff coverage is 83.33%.

@@              Coverage Diff              @@
##             develop    #4518      +/-   ##
=============================================
+ Coverage      48.20%   48.23%   +0.02%     
- Complexity      5084     5136      +52     
=============================================
  Files            642      649       +7     
  Lines          42752    43041     +289     
  Branches        5591     5630      +39     
=============================================
+ Hits           20608    20759     +151     
- Misses         19639    19774     +135     
- Partials        2505     2508       +3     
Impacted Files Coverage Δ
.../org/apache/rocketmq/store/index/IndexService.java 57.36% <83.33%> (+0.37%) :arrow_up:
.../broker/subscription/SubscriptionGroupManager.java 64.70% <0.00%> (-16.48%) :arrow_down:
...apache/rocketmq/broker/client/ProducerManager.java 71.81% <0.00%> (-13.74%) :arrow_down:
...e/rocketmq/namesrv/routeinfo/RouteInfoManager.java 78.83% <0.00%> (-2.55%) :arrow_down:
.../java/org/apache/rocketmq/store/MultiDispatch.java 64.21% <0.00%> (-2.04%) :arrow_down:
...e/rocketmq/remoting/netty/NettyRemotingServer.java 57.34% <0.00%> (-1.90%) :arrow_down:
...a/org/apache/rocketmq/store/index/IndexHeader.java 92.72% <0.00%> (-1.82%) :arrow_down:
.../rocketmq/broker/filter/ConsumerFilterManager.java 72.19% <0.00%> (-1.80%) :arrow_down:
...e/rocketmq/remoting/netty/NettyRemotingClient.java 45.48% <0.00%> (-1.13%) :arrow_down:
...pache/rocketmq/store/dledger/DLedgerCommitLog.java 75.35% <0.00%> (-1.09%) :arrow_down:
... and 56 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 0bc3b99...72927bc.

codecov-commenter avatar Jun 27 '22 02:06 codecov-commenter

Hi @Kvicii ,

We really appreciate your effort in helping us make RocketMQ stronger. However, I've seen many times that you approved the changes immediately without careful review (e.g. this PR didn't even pass the CI). We welcome you to join us and make a contribution, but this is definitely not the behavior we want in our community.

If you'd like to learn the details of RocketMQ or how to review our code, please send me an email; you can find the address on my profile.

tsunghanjacktsai avatar Jun 28 '22 02:06 tsunghanjacktsai

@tsunghanjacktsai this PR didn't pass the CI. What should I do?

liuzongliang0202 avatar Jun 28 '22 02:06 liuzongliang0202

> @tsunghanjacktsai this PR didn't pass the CI. What should I do?

Find the failing test in the CI and fix the bug locally.

hzh0425 avatar Jun 28 '22 07:06 hzh0425

> Is it possible to add logic to flush the IndexFile periodically, just as ConsumeQueue, StoreCheckpoint, and CommitLog are flushed periodically?
>
> DefaultMessageStore#FlushConsumeQueueService, CommitLog (FlushCommitLogService)

The IndexFile is written with random access, and its checkpoint is not recorded until the file is full. If the broker exits abnormally, any IndexFile newer than the checkpoint is deleted and rebuilt from the CommitLog, so there is no need to flush it periodically.

liuzongliang0202 avatar Jun 29 '22 01:06 liuzongliang0202
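
The recovery rule described in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not RocketMQ's code: `IndexRecoverySketch`, `IndexFileInfo`, and `filesToKeep` are hypothetical names, and each file is reduced to the timestamp of its newest entry.

```java
import java.util.ArrayList;
import java.util.List;

class IndexRecoverySketch {
    /** Hypothetical summary of one index file: its name and the timestamp of its newest entry. */
    static final class IndexFileInfo {
        final String name;
        final long endTimestamp;

        IndexFileInfo(String name, long endTimestamp) {
            this.name = name;
            this.endTimestamp = endTimestamp;
        }
    }

    /**
     * Returns the index files that survive a restart. After an abnormal exit,
     * files whose newest entry is later than the checkpoint are dropped; in the
     * scenario described above they would then be rebuilt from the commit log.
     */
    static List<IndexFileInfo> filesToKeep(List<IndexFileInfo> files,
                                           boolean lastExitOk,
                                           long checkpointTimestamp) {
        List<IndexFileInfo> kept = new ArrayList<>();
        for (IndexFileInfo file : files) {
            boolean pastCheckpoint = file.endTimestamp > checkpointTimestamp;
            if (!lastExitOk && pastCheckpoint) {
                continue; // not trusted: contents may never have reached the disk
            }
            kept.add(file);
        }
        return kept;
    }
}
```

In this sketch nothing is dropped when `lastExitOk` is true, so a file that was never flushed would be kept as-is after a normal exit. That is the case this PR is concerned with.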

> Is it possible to add logic to flush the IndexFile periodically, just as ConsumeQueue, StoreCheckpoint, and CommitLog are flushed periodically?

I found the failing test in the CI. I ran the same logic locally without any errors, and I can debug it remotely on Linux without any problems. I don't know why the CI fails.

Here is the test failure: Failed tests: DLedgerMultiPathTest.multiDirsStorageTest:49 expected:<11> but was:<10>

liuzongliang0202 avatar Jun 29 '22 01:06 liuzongliang0202

> > Is it possible to add logic to flush the IndexFile periodically, just as ConsumeQueue, StoreCheckpoint, and CommitLog are flushed periodically?
>
> I found the failing test in the CI. I ran the same logic locally without any errors, and I can debug it remotely on Linux without any problems. I don't know why the CI fails.
>
> Here is the test failure: Failed tests: DLedgerMultiPathTest.multiDirsStorageTest:49 expected:<11> but was:<10>

@liuzongliang0202 This test may be unstable. I could not reproduce the problem either. You could add some logging to show more context, so that more information is available if the test fails again. @cserwen Would you like to give a hand?

caigy avatar Jun 29 '22 02:06 caigy

> > > Is it possible to add logic to flush the IndexFile periodically, just as ConsumeQueue, StoreCheckpoint, and CommitLog are flushed periodically?
> >
> > I found the failing test in the CI. I ran the same logic locally without any errors, and I can debug it remotely on Linux without any problems. I don't know why the CI fails.
> >
> > Here is the test failure: Failed tests: DLedgerMultiPathTest.multiDirsStorageTest:49 expected:<11> but was:<10>
>
> @liuzongliang0202 This test may be unstable. I could not reproduce the problem either. You could add some logging to show more context, so that more information is available if the test fails again. @cserwen Would you like to give a hand?

@caigy I will fix it in this PR: https://github.com/apache/rocketmq/pull/4523. The test should sleep for a while before checking.

cserwen avatar Jun 29 '22 03:06 cserwen
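
For timing-sensitive tests like the one discussed above, a common alternative to a single fixed sleep is to poll the condition until a deadline. The sketch below only illustrates that general pattern; it is not the content of the linked fix, and `AwaitSketch` and `awaitCondition` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

class AwaitSketch {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition holding
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

A test would then assert on the result of `awaitCondition(...)` with a generous timeout, which lets it finish as soon as the expected state is reached and tolerate slow CI machines.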

@liuzongliang0202 Hi, someone has already fixed the CI problem. No worries now :-)

tsunghanjacktsai avatar Jun 30 '22 02:06 tsunghanjacktsai

> @liuzongliang0202 Hi, someone has already fixed the CI problem. No worries now :-)

Thank you very much for your help

liuzongliang0202 avatar Jun 30 '22 02:06 liuzongliang0202

Coverage Status

Coverage increased (+0.08%) to 52.046% when pulling 72927bc356584cf8796de9b94cf7997d25634190 on liuzongliang0202:hotfix-indexFile into d5b4d8431c32e443e2fea3feaed391acecf951eb on apache:develop.

coveralls avatar Jul 12 '22 07:07 coveralls

Coverage Status

Coverage decreased (-0.05%) to 51.916% when pulling 72927bc356584cf8796de9b94cf7997d25634190 on liuzongliang0202:hotfix-indexFile into d5b4d8431c32e443e2fea3feaed391acecf951eb on apache:develop.

coveralls avatar Jul 12 '22 07:07 coveralls

This PR is stale because it has been open for 365 days with no activity. It will be closed in 3 days if no further activity occurs. If you wish not to mark it as stale, please leave a comment in this PR.

github-actions[bot] avatar Dec 23 '23 00:12 github-actions[bot]

This PR was closed because it has been inactive for 3 days since being marked as stale.

github-actions[bot] avatar Dec 27 '23 00:12 github-actions[bot]