status.total counter is not correct for openshift/conformance suite
The total field of the status line is not correct for the openshift/conformance suite (default and parallel).
The problem was found when running OPCT on the latest release. OPCT is built on top of the openshift-tests binary and consumes that counter to report execution progress to the user. More details are available here: https://issues.redhat.com/browse/SPLAT-696
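For context, a consumer like OPCT only sees the (failed/index/total) tuple printed on each started: line, so a drifting total breaks any progress estimate built on it. A minimal sketch of such a consumer, assuming the suite output was captured to a file (run.log is a hypothetical name; this is not OPCT's actual code):
$ grep -o 'started: ([0-9]*/[0-9]*/[0-9]*)' run.log | tr -d '()' | \
    awk '{split($2, f, "/"); printf "progress: %s of %s (failed so far: %s)\n", f[2], f[3], f[1]}'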
Version
$ oc version
Client Version: 4.10.10
Server Version: 4.11.0
Kubernetes Version: v1.24.0+9546431
Steps To Reproduce
- openshift-tests run openshift/conformance
- Wait for the 1127th test
- Check whether the total keeps increasing with the index, the second field [(failed/index/total)] of the status line (see the filter sketch after this list)
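To watch just that field live, a filter along these lines should work (a sketch based on the output format shown under Current Result; run.log is a hypothetical capture):
$ openshift-tests run openshift/conformance 2>&1 | tee run.log | \
    grep --line-buffered -o 'started: ([0-9]*/[0-9]*/[0-9]*)'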
Current Result
After the 1127th test, the total counter keeps increasing with the index:
openshift-tests version: 4.11.0-202208020706.p0.gb860532.assembly.stream-b860532
Starting SimultaneousPodIPController
I0809 16:31:15.790490 3733 shared_informer.go:255] Waiting for caches to sync for SimultaneousPodIPController
started: (0/1/1127) "[sig-scheduling][Early] The openshift-monitoring pods should be scheduled on different nodes [Suite:openshift/conformance/parallel]"
(...)
started: (0/1126/1127) "[sig-storage] PersistentVolumes-expansion loopback local block volume should support online expansion on node [Suite:openshift/conformance/parallel] [Suite:k8s]"
passed: (38s) 2022-08-09T17:12:21 "[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options [Suite:openshift/conformance/parallel] [Suite:k8s]"
started: (0/1127/1127) "[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: tmpfs] [Testpattern: Generic Ephemeral-volume (block volmode) (late-binding)] ephemeral should support two pods which have the same volume definition [Suite:openshift/conformance/parallel] [Suite:k8s]"
passed: (6.6s) 2022-08-09T17:12:21 "[sig-storage] Downward API volume should provide container's memory request [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"
started: (0/1128/1128) "[sig-storage] In-tree Volumes [Driver: cinder] [Testpattern: Dynamic PV (immediate binding)] topology should fail to schedule a pod which has topologies that conflict with AllowedTopologies [Suite:openshift/conformance/parallel] [Suite:k8s]"
skip [k8s.io/[email protected]/test/e2e/storage/framework/testsuite.go:116]: Driver local doesn't support GenericEphemeralVolume -- skipping
Ginkgo exit error 3: exit with code 3
skipped: (400ms) 2022-08-09T17:12:21 "[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: tmpfs] [Testpattern: Generic Ephemeral-volume (block volmode) (late-binding)] ephemeral should support two pods which have the same volume definition [Suite:openshift/conformance/parallel] [Suite:k8s]"
started: (0/1129/1129) "[sig-storage] In-tree Volumes [Driver: emptydir] [Testpattern: Dynamic PV (default fs)] capacity provides storage capacity information [Suite:openshift/conformance/parallel] [Suite:k8s]"
After that, it keeps increasing until the last test (3475th):
started: (30/3474/3474) "[sig-arch][bz-etcd][Late] Alerts alert/etcdGRPCRequestsSlow should not be at or above pending [Suite:openshift/conformance/parallel]"
passed: (4.5s) 2022-08-09T18:26:40 "[sig-arch][bz-Unknown][Late] Alerts alert/KubePodNotReady should not be at or above info in all the other namespaces [Suite:openshift/conformance/parallel]"
started: (30/3475/3475) "[sig-arch][bz-Unknown][Late] Alerts alert/KubePodNotReady should not be at or above pending in ns/default [Suite:openshift/conformance/parallel]"
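One way to confirm the drift from a saved log (run.log again being the hypothetical capture from above) is to count the distinct values the total field takes across all started: lines; a correct counter would produce exactly one value:
$ grep -o 'started: ([0-9]*/[0-9]*/[0-9]*)' run.log | awk -F'/' '{print $3}' | tr -d ')' | sort -n | uniq -c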
Expected Result
The total should report the full suite size from the first test:
started: (0/1/3475) (....)
Additional Information
Extracting the openshift-tests binary from the same release the cluster is running, I got a different count:
$ ./.local/bin/openshift-install-linux-4.11.0 version
./.local/bin/openshift-install-linux-4.11.0 4.11.0
built from commit 37684309bcb598757c99d3ea9fbc0758343d64a5
release image quay.io/openshift-release-dev/ocp-release@sha256:300bce8246cf880e792e106607925de0a404484637627edf5f517375517d54a4
release architecture amd64
$ RELEASE_IMAGE=$(./.local/bin/openshift-install-linux-4.11.0 version | awk '/release image/ {print $3}')
$ TESTS_IMAGE=$(oc adm release info --image-for='tests' $RELEASE_IMAGE)
$ oc image extract $TESTS_IMAGE --file="/usr/bin/openshift-tests" -a ~/.openshift/pull-secret-latest.json
$ chmod u+x openshift-tests
$ ./openshift-tests run --dry-run openshift/conformance | wc -l
3487
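As a cross-check, the dry-run count can be put next to the last total the runtime reported, here 3487 versus 3475 (run.log as above; note that wc -l counts every line of the dry-run output, so a small offset from non-test lines is possible):
$ EXPECTED=$(./openshift-tests run --dry-run openshift/conformance | wc -l)
$ REPORTED=$(grep -o 'started: ([0-9]*/[0-9]*/[0-9]*)' run.log | tail -1 | awk -F'/' '{print $3}' | tr -d ')')
$ echo "dry-run lines: $EXPECTED, final runtime total: $REPORTED"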
We talked about this issue during the install flex sync meeting today. We don't think it is overly concerning, but it will be a problem for anyone who wants to monitor the count as the run happens: it will be difficult to determine when the tests will end.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.