gitops-engine icon indicating copy to clipboard operation
gitops-engine copied to clipboard

fix(hooks): always remove finalizers

Open agaudreault opened this issue 5 months ago • 3 comments

Closes https://github.com/argoproj/argo-cd/issues/23226

Description

The new finalizer on hooks allow us to observe the hook state before Kubernetes tries to delete it. Multiple issue where found causing the finalizers not to be properly removed on hooks managed by the sync.

  • When using SyncFail phase hooks, if any hook fail to apply (sometimes happens with Replace=true or bad manifest), the finalizers will not be removed and the sync will be failed with ongoing running tasks. This is a problem in the gitops-engine logic.
  • When any resource fail to apply during a phase, the finalizers wont be removed from the existing hooks of that phase
  • When there is a "sync wave hook" error, the finalizers wont be removed from the existing hooks of that phase

Changes

  • ensure hook finalizers are always removed so hooks don’t get stuck by removing finalizers on any existing hook task before creating/starting a new one, and ensure the finalizer is removed before deleting BeforeHookCreation hooks.
  • Don’t preemptively fail a sync if failure hooks fail to apply. Wait for the failure hooks to complete before Failing. Otherwise, Argo CD will immediately retry while the SyncFail hooks are running.
  • Terminate ongoing hooks on errors/terminate
  • addresses a race between BeforeHookCreation and Replace=true.
  • Clarify docs
  • Add test for all finalizers termination case

Tests

  • [x] Test sync all Delete policies & finalizers removed
  • [x] Test sync existing hook with finalizers succeeds
  • [x] Test sync running + terminate remove all finalizers and running hooks
  • [x] Test sync fail + syncFailTasks
  • [x] Test sync fail + syncFailTasks fail
  • [x] test syncWaveHook fails with syncFailTasks
  • [x] test race condition with BeforeHookCreation and Replace=true

agaudreault avatar Aug 12 '25 22:08 agaudreault

Codecov Report

:x: Patch coverage is 73.20574% with 56 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 48.32%. Comparing base (8849c3f) to head (ef00d91). :warning: Report is 57 commits behind head on master.

Files with missing lines Patch % Lines
pkg/sync/sync_context.go 79.27% 30 Missing and 10 partials :warning:
...kg/utils/kube/kubetest/mock_resource_operations.go 0.00% 16 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #754      +/-   ##
==========================================
- Coverage   54.26%   48.32%   -5.95%     
==========================================
  Files          64       64              
  Lines        6164     6639     +475     
==========================================
- Hits         3345     3208     -137     
- Misses       2549     3170     +621     
+ Partials      270      261       -9     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Aug 12 '25 22:08 codecov[bot]

The gitops-engine repository is migrating to https://github.com/argoproj/argo-cd.

Can you please update the PR by resolving the conflicts / If it's not relevant anymore, feel free to close it You can always re-open it when the migration is over

ppapapetrou76 avatar Sep 11 '25 08:09 ppapapetrou76