Pod Deletion Event Not Caught by waitForPodDeletion Function

Open SuJinpei opened this issue 6 months ago • 0 comments

Environment Information

  • Which image of the operator are you using? master branch
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? Bare Metal K8s
  • Are you running Postgres Operator in production? No (test environment)
  • Type of issue? Bug report

File Path

pkg/cluster/pod.go in the recreatePod function

Description of the Issue

Sometimes, when a pod has already been deleted, the waitForPodDeletion call in the recreatePod function fails to catch the pod deletion event. The call then hangs or times out, even though the pod has actually been deleted.

Steps to Reproduce

  1. Trigger pod recreation via the recreatePod function
  2. In some cases, the pod gets deleted successfully
  3. However, the waitForPodDeletion function doesn't detect this event
  4. The process gets stuck waiting for an event that won't come

Expected Behavior

The waitForPodDeletion function should reliably detect when a pod has been deleted, regardless of timing or race conditions.

Actual Behavior

The function sometimes misses the deletion event, causing the process to hang.

Additional Information

This issue appears to be related to pod event handling in the operator. It might be a race condition in which the pod deletion event fires before the event subscriber is registered, or the event is otherwise dropped before it reaches the subscriber.

SuJinpei, May 28 '25 07:05