[BugFix] Fix missed job state update when replaying restore log, which may cause the restored table to be lost after FE restart

Open srlch opened this issue 1 year ago • 3 comments

Why I'm doing:

If history_job_keep_max_second is set to less than the checkpoint interval (whether the image is created by the background daemon or manually), then when the restore job's log is replayed, the final FINISHED state log is skipped and never updates the job state in BackupHandler, because the job is treated as expired according to history_job_keep_max_second.

The problem is that only the FINISHED state log is skipped, not the logs for other states, so BackupHandler persists an unfinished state for the job even though the job has actually finished.

When FE restarts in this situation, it loads an unfinished restore job and tries to rewrite the original table that was restored before the restart, which causes the restored table to be lost.

What I'm doing:

When we encounter an expired job in the FINISHED state, we skip updating to the finished state if and only if BackupHandler holds no other state for this job.
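The rule above can be sketched roughly as follows. This is a simplified illustration, not the actual StarRocks code: the class `RestoreReplaySketch`, the nested `Job` type, the reduced `State` enum, the `dbIdToJob` map, and the `replay` signature are all hypothetical names chosen for the example. Only the decision itself (skip an expired FINISHED log only when no earlier state is held for the job) comes from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the replay path in BackupHandler.
class RestoreReplaySketch {
    // Reduced set of states for illustration; the real job has more.
    enum State { PENDING, DOWNLOADING, FINISHED }

    // Hypothetical replayed log entry for a restore job.
    static final class Job {
        final long dbId;
        final State state;
        final long finishedTimeMs;

        Job(long dbId, State state, long finishedTimeMs) {
            this.dbId = dbId;
            this.state = state;
            this.finishedTimeMs = finishedTimeMs;
        }

        // A job counts as expired only once it is FINISHED and older than the keep window.
        boolean isExpired(long nowMs, long keepMaxSecond) {
            return state == State.FINISHED
                    && nowMs - finishedTimeMs > keepMaxSecond * 1000L;
        }
    }

    // Latest known job per database (assumed layout, for the sketch only).
    private final Map<Long, Job> dbIdToJob = new HashMap<>();

    void replay(Job job, long nowMs, long keepMaxSecond) {
        boolean expired = job.isExpired(nowMs, keepMaxSecond);
        // Skip the expired FINISHED log only if no earlier state is held for this job.
        if (expired && !dbIdToJob.containsKey(job.dbId)) {
            return;
        }
        // Otherwise overwrite the stale, unfinished state with the replayed one.
        dbIdToJob.put(job.dbId, job);
    }

    // Returns the held state for a database, or null if no job is held.
    State stateOf(long dbId) {
        Job job = dbIdToJob.get(dbId);
        return job == null ? null : job.state;
    }

    public static void main(String[] args) {
        RestoreReplaySketch handler = new RestoreReplaySketch();
        long now = 100_000L;
        // An intermediate state is replayed first, then an already-expired FINISHED log.
        handler.replay(new Job(1L, State.DOWNLOADING, 0L), now, 10L);
        handler.replay(new Job(1L, State.FINISHED, 1_000L), now, 10L);
        System.out.println(handler.stateOf(1L));
    }
}
```

With this check, a job whose intermediate state was already replayed ends up FINISHED, so it is not persisted as unfinished, while an expired job that has no earlier state is still dropped.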

Fixes #issue https://github.com/StarRocks/StarRocksTest/issues/7904

What type of PR is this:

  • [x] BugFix
  • [ ] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.3
    • [x] 3.2
    • [x] 3.1
    • [ ] 3.0
    • [ ] 2.5

srlch avatar Jun 26 '24 03:06 srlch

Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Jul 12 '24 04:07 sonarqubecloud[bot]

[FE Incremental Coverage Report]

:white_check_mark: pass : 8 / 8 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/backup/RestoreJob.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: com/starrocks/backup/BackupHandler.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/backup/mv/MvRestoreContext.java | 2 | 2 | 100.00% | [] |

github-actions[bot] avatar Jul 12 '24 07:07 github-actions[bot]

[BE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Jul 12 '24 07:07 github-actions[bot]