Flaky test: [It] should delete redundant Pods

Open tenzen-y opened this issue 2 years ago • 3 comments

Flaky test: TFJob controller Test Scale Down [It] should delete redundant Pods.

------------------------------
• [FAILED] [30.010 seconds]
TFJob controller Test Scale Down [It] should delete redundant Pods
/home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/pod_test.go:278

  Timeline >>
  2023-07-04T05:44:36Z	DEBUG	events	Error creating: pods "test-scale-down-worker-1" already exists	{"type": "Warning", "object": {"kind":"TFJob","namespace":"default","name":"test-scale-down","uid":"74483f51-18e0-406f-a6e7-78a41b923883"}, "reason": "FailedCreatePod"}
  2023-07-04T05:44:36Z	INFO	Starting workers	{"controller": "tfjob-controller", "worker count": 1}
  2023-07-04T05:44:36Z	INFO	TFJob.kubeflow.org "test-exit-code" not found	{"tfjob": {"name":"test-exit-code","namespace":"default"}, "unable to fetch TFJob": "default/test-exit-code"}
  2023-07-04T05:44:36Z	INFO	TFJob.kubeflow.org "test-scale-down" not found	{"tfjob": {"name":"test-scale-down","namespace":"default"}, "unable to fetch TFJob": "default/test-scale-down"}
  [FAILED] in [It] - /home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/pod_test.go:338 @ 07/04/23 05:45:06.562
  << Timeline

  [FAILED] Timed out after 30.000s.
  Expected
      <bool>: false
  to be true
  In [It] at: /home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/pod_test.go:338 @ 07/04/23 05:45:06.562
------------------------------

https://github.com/kubeflow/training-operator/actions/runs/5451231793/jobs/9917258442?pr=1847#step:4:129

tenzen-y avatar Jul 04 '23 05:07 tenzen-y

Maybe we should run each test in a different namespace. Currently, we run almost all tests in the default namespace.
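A minimal sketch of how per-test namespaces could be named, assuming each spec creates its own namespace before creating the TFJob. The helper `uniqueNamespace` is hypothetical and not part of training-operator; it only derives a name that fits the DNS-1123 label rules Kubernetes applies to namespace names (lowercase alphanumerics and `-`, at most 63 characters) and appends a random suffix so that two specs, or two runs of the same spec, do not share a namespace.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// maxLabelLen is the length limit of a DNS-1123 label, which Kubernetes
// namespace names must satisfy.
const maxLabelLen = 63

// uniqueNamespace is a hypothetical helper (not part of training-operator).
// It turns a spec name into a valid namespace name and appends a random
// suffix so that each test gets a namespace of its own.
func uniqueNamespace(specName string) (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	suffix := "-" + hex.EncodeToString(buf) // "-" plus 8 hex characters

	// Lowercase ASCII letters, keep digits, and map everything else to '-'.
	prefix := strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
			return r
		case r >= 'A' && r <= 'Z':
			return r + ('a' - 'A')
		default:
			return '-'
		}
	}, specName)
	prefix = strings.Trim(prefix, "-")

	// Leave room for the suffix and avoid ending the prefix with '-'.
	if limit := maxLabelLen - len(suffix); len(prefix) > limit {
		prefix = strings.TrimRight(prefix[:limit], "-")
	}
	if prefix == "" {
		prefix = "test"
	}
	return prefix + suffix, nil
}

func main() {
	for _, spec := range []string{"Test Scale Down", "Test Exit Code"} {
		ns, err := uniqueNamespace(spec)
		if err != nil {
			panic(err)
		}
		fmt.Println(ns) // e.g. test-scale-down-<8 random hex characters>
	}
}
```

With names like these, a Pod such as `test-scale-down-worker-1` left over from another spec would live in a different namespace, so it should not trigger the "already exists" error seen in the log above. Whether that is the actual cause of the flake is still unconfirmed.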

tenzen-y avatar Jul 04 '23 06:07 tenzen-y

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 02 '23 10:10 github-actions[bot]

/lifecycle frozen

tenzen-y avatar Oct 02 '23 14:10 tenzen-y