
[e2e]: wait for pod ready before continuing the rest of the test logic

Open JaneLiuL opened this issue 3 years ago • 6 comments

Background: pull requests keep failing with pods stuck in Pending, as shown below, so the tests need to wait for the pods to be ready before continuing the rest of the logic. For TestTopologySpreadConstraint I adjusted the timeout; for TestTooManyRestarts I added a wait for the pod to be scheduled, so that podList.Items[0].Status.ContainerStatuses is populated before the restart count is checked. A sketch of the kind of wait involved is shown after the logs below.

=== RUN   TestTooManyRestarts
    e2e_toomanyrestarts_test.go:50: Creating testing namespace TestTooManyRestarts
    e2e_toomanyrestarts_test.go:86: Creating deployment restart-pod
    e2e_toomanyrestarts_test.go:196: Waiting for podList.Items[0].Status.ContainerStatuses to be populated
    e2e_toomanyrestarts_test.go:105: Pod restart count not as expected
--- FAIL: TestTooManyRestarts (5.19s)

...
=== RUN   TestTopologySpreadConstraint/test-rc-topology-spread-soft-constraint
    e2e_topologyspreadconstraint_test.go:59: Creating RC test-rc-topology-spread-soft-constraint with 4 replicas
    e2e_test.go:1010: Waiting for 4 pods to be created, got 0 instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1015: Pod test-rc-topology-spread-soft-constraint-27t6z not running yet, is Pending instead
    e2e_test.go:1021: Error waiting for pods running: timed out waiting for the condition
    e2e_test.go:1064: Waiting for test-rc-topology-spread-soft-constraint RC pods to disappear, still 4 remaining
    e2e_test.go:1064: Waiting for test-rc-topology-spread-soft-constraint RC pods to disappear, still 0 remaining
--- FAIL: TestTopologySpreadConstraint (72.30s)
    --- FAIL: TestTopologySpreadConstraint/test-rc-topology-spread-hard-constraint (36.12s)
    --- FAIL: TestTopologySpreadConstraint/test-rc-topology-spread-soft-constraint (36.06s)
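For reference, here is a minimal sketch (not the exact diff in this PR) of the kind of wait added to TestTooManyRestarts: poll until podList.Items[0].Status.ContainerStatuses is populated before asserting on restart counts. It assumes the test already has ctx, clientSet, and the test namespace in scope; the helper name waitForContainerStatuses and the 5s/2m intervals are illustrative. Here wait is k8s.io/apimachinery/pkg/util/wait, kubernetes is k8s.io/client-go/kubernetes, v1 is k8s.io/api/core/v1, and metav1 is k8s.io/apimachinery/pkg/apis/meta/v1.

// waitForContainerStatuses is an illustrative helper (name and intervals are
// assumptions, not this PR's actual code): it polls until the first pod in the
// namespace reports ContainerStatuses, so the restart-count assertion no longer
// races pod scheduling.
func waitForContainerStatuses(ctx context.Context, t *testing.T, clientSet kubernetes.Interface, namespace string) *v1.PodList {
	var podList *v1.PodList
	if err := wait.PollImmediate(5*time.Second, 2*time.Minute, func() (bool, error) {
		var err error
		podList, err = clientSet.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
		if err != nil {
			return false, err
		}
		if len(podList.Items) == 0 || len(podList.Items[0].Status.ContainerStatuses) < 1 {
			t.Log("Waiting for podList.Items[0].Status.ContainerStatuses to be populated")
			return false, nil
		}
		return true, nil
	}); err != nil {
		t.Fatalf("Error waiting for container statuses: %v", err)
	}
	return podList
}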

JaneLiuL avatar Mar 11 '22 12:03 JaneLiuL

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: JaneLiuL

To complete the pull request process, please assign seanmalloy after the PR has been reviewed. You can assign the PR to them by writing /assign @seanmalloy in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Mar 11 '22 12:03 k8s-ci-robot

/hold

JaneLiuL avatar Mar 11 '22 12:03 JaneLiuL

I originally planned to use a watcher, as shown below, but that seems too complicated for e2e. So I just adjusted the timeout in TestTopologySpreadConstraint, and in TestTooManyRestarts, instead of returning while the pod is still Pending or while len(podList.Items[0].Status.ContainerStatuses) < 1, the check now sleeps and retries. A sketch of that simpler polling approach follows the watcher snippet below.

// Watch only the pod we care about by selecting on its name.
podNameSelectorStr := fmt.Sprintf("metadata.name=%s", pod.Name)
watcher, err := clientSet.CoreV1().Pods(pod.Namespace).Watch(ctx, metav1.ListOptions{
	FieldSelector: podNameSelectorStr,
})
if err != nil {
	t.Fatalf("Watch pod fail %v", err)
	return false, err // unreachable after Fatalf; kept only to satisfy the (bool, error) signature
}
timeout := time.After(2 * time.Minute)
for {
	select {
	case <-timeout:
		watcher.Stop()
		t.Log("Timeout, pod still not running as expected")
		return false, fmt.Errorf("timed out waiting for pod %s to be Running", pod.Name)
	case event, ok := <-watcher.ResultChan():
		if !ok {
			// The watch channel was closed before the pod became Running.
			return false, fmt.Errorf("watch channel closed for pod %s", pod.Name)
		}
		if event.Type != watch.Modified {
			continue
		}
		if updatedPod, ok := event.Object.(*v1.Pod); ok && updatedPod.Status.Phase == v1.PodRunning {
			// A plain break here would only exit the select, so return explicitly.
			watcher.Stop()
			return true, nil
		}
	}
}
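And here is a sketch of the simpler polling direction described above (again illustrative, not the exact diff: the helper name, label selector parameter, and the widened 3-minute timeout are assumptions): keep retrying while pods are still Pending rather than failing immediately, which is what the timeout adjustment in TestTopologySpreadConstraint relies on. The same client-go and wait packages as in the earlier sketch are assumed.

// waitPodsRunning is an illustrative polling helper: it retries while pods are
// still being created or are in a non-Running phase, and only fails once the
// widened timeout expires.
func waitPodsRunning(ctx context.Context, t *testing.T, clientSet kubernetes.Interface, namespace, labelSelector string, want int) {
	if err := wait.PollImmediate(10*time.Second, 3*time.Minute, func() (bool, error) {
		podList, err := clientSet.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{LabelSelector: labelSelector})
		if err != nil {
			return false, err
		}
		if len(podList.Items) != want {
			t.Logf("Waiting for %d pods to be created, got %d instead", want, len(podList.Items))
			return false, nil
		}
		for _, pod := range podList.Items {
			if pod.Status.Phase != v1.PodRunning {
				// Still Pending (or another non-Running phase): keep polling rather than failing.
				t.Logf("Pod %s not running yet, is %v instead", pod.Name, pod.Status.Phase)
				return false, nil
			}
		}
		return true, nil
	}); err != nil {
		t.Fatalf("Error waiting for pods running: %v", err)
	}
}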

JaneLiuL avatar Mar 11 '22 14:03 JaneLiuL

/hold cancel

JaneLiuL avatar Mar 11 '22 14:03 JaneLiuL

/retest

damemi avatar May 12 '22 18:05 damemi

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Aug 10 '22 19:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Sep 09 '22 19:09 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Oct 09 '22 20:10 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 09 '22 20:10 k8s-ci-robot