
Descriptive restore error on timeout due to terminating namespace

Open kaovilai opened this issue 4 months ago • 1 comments

Signed-off-by: Tiger Kaovilai [email protected]

Thank you for contributing to Velero!

Please add a summary of your change

Make the restore error descriptive when the namespace being restored is in a terminating state. Anyone seeing this error should know that Velero does not force a namespace to disappear by removing finalizers, because doing so could be destructive to some workloads.

The user should bring the namespace out of the terminating state so that Velero can continue the restore process.

Does your change fix a particular issue?

Fixes #5697

Please indicate you've done the following:

  • [x] Accepted the DCO. Commits without the DCO will delay acceptance.
  • [x] Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • [x] Updated the corresponding documentation in site/content/docs/main.

kaovilai, Feb 13 '24 19:02

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 61.71%. Comparing base (2518824) to head (e7ffa62).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7424   +/-   ##
=======================================
  Coverage   61.71%   61.71%           
=======================================
  Files         263      263           
  Lines       28869    28873    +4     
=======================================
+ Hits        17816    17819    +3     
- Misses       9793     9794    +1     
  Partials     1260     1260           


codecov[bot], Feb 13 '24 19:02

Hi @kaovilai @blackpiglet, we are seeing issue #7516. Please take a look at it as well; it concerns the same method:

EnsureNamespaceExistsAndIsReady

During restore, for every resource within a namespace, we call the check that waits for the namespace to exist (polling, with a 10-minute wait). For instance, if a namespace holds 100 resources and stays in the terminating state for a very long time, the restore gets stuck, and the total restore time grows because the wait is repeated for each resource in that namespace.

mayankagg9722, Mar 27 '24 10:03