shell-operator
DeleteOperation polling is not reliable
In the patch collector, the delete operation polls until the resource is confirmed deleted. The poll can fail if the pod is recreated under the same name within a second, presumably because every poll then still finds a pod with that name.
https://github.com/flant/shell-operator/blob/6b0058ccb3ce755f7a5f44b71a733b25e95f104c/pkg/kube/object_patch/patch.go#L278
Expected behavior (what you expected to happen):
When a pod is deleted via the patch collector and re-created with the same name, the hook should not return a (false positive) timeout error.
Actual behavior (what actually happened):
Deleting a pod whose name does not change (one owned by a StatefulSet) results in a timeout error, even though the original pod was deleted.
Steps to reproduce:
- Apply a delete operation to a StatefulSet pod that is recreated quickly enough to roll out within 1 second
- shell-operator reports a timeout error
Environment:
Deckhouse smoke-mini rescheduler hook
Logs
Pod "smoke-mini-d-0" marked for deletion
Module hook failed, requeue task to retry after delay. Failed count is 1. Error: 1 error occurred:
* Delete object v1/Pod/d8-upmeter/smoke-mini-d-0: timed out waiting for the condition