camunda icon indicating copy to clipboard operation
camunda copied to clipboard

Clean Backoff column family from wrong jobs

Open nicpuppa opened this issue 1 year ago • 7 comments

Description

Due to https://github.com/camunda/zeebe/issues/14329 a new migration was introduced to restore the backoff column family with the correct data.

Perhaps, we missed to cover some cases (see https://github.com/camunda/zeebe/issues/15954).

With https://github.com/camunda/zeebe/pull/16047 a fix of the migration was added, but the wrong data were not cleaned up

This PR add a method to clean up the wrong data in the backoff column family

Related issues

closes #15954

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • [ ] The changes are backwards compatibility with previous versions
  • [ ] If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • [ ] There are unit/integration tests that verify all acceptance criterias of the issue
  • [ ] New tests are written to ensure backwards compatibility with further versions
  • [ ] The behavior is tested manually
  • [ ] The change has been verified by a QA run
  • [ ] The impact of the changes is verified by a benchmark

Documentation:

  • [ ] The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • [ ] If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Other teams: If the change impacts another team an issue has been created for this team, explaining what they need to do to support this change.

Please refer to our review guidelines.

nicpuppa avatar Feb 21 '24 17:02 nicpuppa

@nicpuppa this needs backporting down to 8.1, right?

megglos avatar Feb 22 '24 16:02 megglos

@nicpuppa this needs backporting down to 8.1, right?

I think so

nicpuppa avatar Feb 22 '24 16:02 nicpuppa

Thanks for the review @koevskinikola, I applied all the suggestions. Please have another look 👀

nicpuppa avatar Feb 23 '24 15:02 nicpuppa

@nicpuppa there seem to be some merge conflicts, can you resolve them before merging the PR?

@Zelldon would you like to review the PR before merging?

koevskinikola avatar Feb 26 '24 10:02 koevskinikola

Sorry I have limited capacity currently, but I try my best to review it

Zelldon avatar Feb 26 '24 15:02 Zelldon

I'm bit a confused right now. Could we check what cases we would now have and what this means now.

All the cases that we have are:

  • Failed job (retries > 0) are not in the backoff column (covered by migration)
  • Jobs activable with retries = 0 are in the backoff column family (this PR should fix this)

Those should be all the cases.

cc @koevskinikola @Zelldon

nicpuppa avatar Feb 27 '24 09:02 nicpuppa

@nicpuppa what's the state here for merging, feels unfortunate that this fix wasn't included in the current patch round, do you await approval by @Zelldon ?

megglos avatar Mar 07 '24 09:03 megglos

@nicpuppa what's the state here for merging, feels unfortunate that this fix wasn't included in the current patch round, do you await approval by @Zelldon ?

Not really, but if @Zelldon have time to review this PR again would be good.

nicpuppa avatar Mar 11 '24 09:03 nicpuppa

@nicpuppa @megglos Unfortunately I don't have enough capacity to give here more guidance, but I think we should evaluate more whether we miss any points, as I mentioned earlier.

As this is already the third PR fixing this related issue, I feel we might need to think more carefully about what else we have to check. It feels to get more risky. As we now might get into issues with different patch versions, and different migration states. If one has applied all patches, one none, one only single patch, does this make any issues during the next patch and migration, etc?

Zelldon avatar Mar 25 '24 13:03 Zelldon

@oleschoenburg given you have context of the original backoff issue https://github.com/camunda/zeebe/issues/14329 and worked on rolling updates, could you help out here with a review and if needed sync with @nicpuppa ? I would like to close this case but given @Zelldon comment I would appreciate if you can assist here, while your rolling update focus that could be valuable here is fresh :)

megglos avatar Mar 26 '24 07:03 megglos

@nicpuppa we could also consider merging this into 8.5.0 still. Might be risky because we have no time to test it but it's the same for the patch releases anyway :shrug:

lenaschoenburg avatar Apr 02 '24 14:04 lenaschoenburg

@nicpuppa we could also consider merging this into 8.5.0 still. Might be risky because we have no time to test it but it's the same for the patch releases anyway 🤷

I would like to ping @megglos on this

nicpuppa avatar Apr 02 '24 15:04 nicpuppa

Backport failed for stable/8.1, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.1
git worktree add -d .worktree/backport-16508-to-stable/8.1 origin/stable/8.1
cd .worktree/backport-16508-to-stable/8.1
git switch --create backport-16508-to-stable/8.1
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 16:04 backport-action

Backport failed for stable/8.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.2
git worktree add -d .worktree/backport-16508-to-stable/8.2 origin/stable/8.2
cd .worktree/backport-16508-to-stable/8.2
git switch --create backport-16508-to-stable/8.2
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 16:04 backport-action

Backport failed for stable/8.3, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.3
git worktree add -d .worktree/backport-16508-to-stable/8.3 origin/stable/8.3
cd .worktree/backport-16508-to-stable/8.3
git switch --create backport-16508-to-stable/8.3
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 16:04 backport-action

Successfully created backport PR for stable/8.4:

  • #17235

backport-action avatar Apr 02 '24 16:04 backport-action

@nicpuppa we could also consider merging this into 8.5.0 still. Might be risky because we have no time to test it but it's the same for the patch releases anyway 🤷

I would like to ping @megglos on this

I agree with Ole let's Backport this to 8.5 as well, given we include it in all patches too.

megglos avatar Apr 02 '24 18:04 megglos

/backport

megglos avatar Apr 02 '24 18:04 megglos

Backport failed for stable/8.1, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.1
git worktree add -d .worktree/backport-16508-to-stable/8.1 origin/stable/8.1
cd .worktree/backport-16508-to-stable/8.1
git switch --create backport-16508-to-stable/8.1
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 18:04 backport-action

Backport failed for stable/8.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.2
git worktree add -d .worktree/backport-16508-to-stable/8.2 origin/stable/8.2
cd .worktree/backport-16508-to-stable/8.2
git switch --create backport-16508-to-stable/8.2
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 18:04 backport-action

Backport failed for stable/8.3, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.3
git worktree add -d .worktree/backport-16508-to-stable/8.3 origin/stable/8.3
cd .worktree/backport-16508-to-stable/8.3
git switch --create backport-16508-to-stable/8.3
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 18:04 backport-action

Backport failed for stable/8.4, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin stable/8.4
git worktree add -d .worktree/backport-16508-to-stable/8.4 origin/stable/8.4
cd .worktree/backport-16508-to-stable/8.4
git switch --create backport-16508-to-stable/8.4
git cherry-pick -x 30907afdda84700183a2e40c608fb24ebece27e9 88e030e9d03e28f7916401bac6202d444ece3461 203bdb3f95cdc02c8d39cd866a5f6c54c9ea5052

backport-action avatar Apr 02 '24 18:04 backport-action

Successfully created backport PR for release-8.5.0:

  • #17236

backport-action avatar Apr 02 '24 18:04 backport-action