
enhance waiting for webhook to be available to accept a request

Open cppforlife opened this issue 3 years ago • 2 comments

Describe the problem/challenge you have: https://gist.github.com/cppforlife/2274082fef6a5e10cedfaa473238b441 -- a newly registered webhook prevents us from saving bookkeeping annotations on the deployment, so we error. We then retry creating the resource later, but the resource has already been created; we just didn't save the bookkeeping annotations.


Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible" 👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help work on this issue.

cppforlife, Oct 29 '21 12:10

slack thread: https://vmware.slack.com/archives/C02D60T1ZDJ/p1635515232432700

renuy, Nov 02 '21 12:11

Triaging conclusions: In this issue the resource was created successfully, but the kapp metadata update failed, which triggered a retry of resource creation.

  • The resource is created along with some metadata. [Failures here can lead to a retry of resource creation.]
  • Once resource creation succeeds, we proceed to update the metadata. This update can fail, and we retry it for one minute.
  • If the metadata update is still failing after that, we return a failure, which re-triggers the retry of resource creation. On that retry, we exit with the error mentioned in the issue.
  • Saving this metadata is important for kapp, because its diffing logic is based on this metadata.

Discussion points:

  1. If the metadata update failed, should we continue with deployment of the app? Impact: diffing info will not be available, so the next time we deploy the app we will see a diff that is not actually present, which may confuse users.
  2. If the metadata update failed but resource creation succeeded, should we fail the whole deployment, inconveniencing users/scripts?
  3. Are there any other options?

sethiyash, Nov 16 '21 14:11