DeleteOperation polling is not reliable

Open shvgn opened this issue 4 years ago • 0 comments

In the patch collector, the delete operation polls to confirm that the resource has actually been deleted. This poll can fail if a pod is re-created with the same name within a second.

https://github.com/flant/shell-operator/blob/6b0058ccb3ce755f7a5f44b71a733b25e95f104c/pkg/kube/object_patch/patch.go#L278
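One possible way to make the poll robust is to compare object UIDs rather than only checking whether the name still exists, since Kubernetes assigns a new `metadata.uid` to a re-created object. The following is a minimal sketch under that assumption, not the code from `patch.go`: `getUIDFunc`, `waitForDeletion` and `errTimeout` are hypothetical names, and the lookup is stubbed out instead of calling the real Kubernetes client.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout mirrors the "timed out waiting for the condition" message from the logs.
var errTimeout = errors.New("timed out waiting for the condition")

// getUIDFunc is a hypothetical lookup: it returns the UID of the object that
// currently holds the name, and found=false when no such object exists.
type getUIDFunc func() (uid string, found bool, err error)

// waitForDeletion polls until the object with originalUID is gone.
// A different UID under the same name counts as "deleted and re-created".
func waitForDeletion(get getUIDFunc, originalUID string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		uid, found, err := get()
		if err != nil {
			return err
		}
		if !found || uid != originalUID {
			return nil // the original object no longer exists
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate a StatefulSet pod that was re-created under the same name
	// before the first poll ran.
	recreated := func() (string, bool, error) { return "uid-new", true, nil }
	err := waitForDeletion(recreated, "uid-old", 10*time.Millisecond, 100*time.Millisecond)
	fmt.Println("recreated pod:", err)

	// An object that is never deleted still times out.
	stuck := func() (string, bool, error) { return "uid-old", true, nil }
	err = waitForDeletion(stuck, "uid-old", 10*time.Millisecond, 50*time.Millisecond)
	fmt.Println("stuck pod:", err)
}
```

With this check, a pod that reappears under the same name with a new UID is treated as successfully deleted, while an object that is genuinely stuck still produces the timeout error.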

Expected behavior (what you expected to happen):

When a pod is deleted via the patch collector and re-created with the same name, the hook should not return a timeout error (a false positive).

Actual behavior (what actually happened):

Deleting a pod whose name does not change (one owned by a StatefulSet) results in a timeout error.

Steps to reproduce:

  1. Apply a delete operation to a StatefulSet pod that starts quickly enough to be re-created within 1 second
  2. Shell-operator reports a timeout error

Environment:

Deckhouse smoke-mini rescheduler hook

Logs
Pod "smoke-mini-d-0" marked for deletion
Module hook failed, requeue task to retry after delay. Failed count is 1. Error: 1 error occurred:
	* Delete object v1/Pod/d8-upmeter/smoke-mini-d-0: timed out waiting for the condition

shvgn avatar Nov 30 '21 09:11 shvgn