operator-sdk
operator-sdk copied to clipboard
Operator-sdk scorecard basic spec check occasionally produces timeout with no logs
Bug Report
operator-sdk v.1.14.0
scorecard-test v1.14.0
I'm from the DCI team. As a part of our daily cert suite, we regularly run operator-sdk scorecard --selector=test=basic-check-spec-test for two operators: simple-demo-operator and testpmd-operator, 10 tests per day.
Normally, basic-check-spec-test should be green for both. But it fails occasionally in 20% of cases for both operators in a row with timeout error error running tests context deadline exceeded. To increase timeout with wait-time up to 300s doesn't help. Also, the test is always failing for both operators in a row and it looks like a 10-20 min of some internal API.
To have more information, could you please add more information in the logs about where did exactly the timeout happen?
What did you do?
operator_sdk scorecard \
--output json \
--selector=test=basic-check-spec-test \
--kubeconfig {{ kubeconfig_path }} \
--namespace scorecard-testing \
--service-account default \
--config {{ scorecard_config_path }} \
--verbose \
--wait-time 300s \
{{ scorecard_operator_dir }}
What did you expect to see?
The results of basic-check-spec-test should be stable. In the case of timeout, it would be nice to have logs to identify what is the reason for this timeout.
What did you see instead? Under which circumstances?
Timeout in 20% of cases error running tests context deadline exceeded with no detailed logs.
Environment
Operator type:
-
name: "testpmd-operator"
version: "v0.2.9"
image: "quay.io/rh-nfv-int/testpmd-operator-bundle@sha256:5e28f883faacefa847104ebba1a1a22ee897b7576f0af6b8253c68b5c8f42815"
index_image: "quay.io/tkrishtop/index-testpmd-operator-bundle:v0.2.9"
-
name: "simple-demo-operator"
version: "v0.0.3"
image: "quay.io/opdev/simple-demo-operator-bundle@sha256:eff7f86a54ef2a340dbf739ef955ab50397bef70f26147ed999e989cfc116b79"
index_image: "quay.io/opdev/simple-demo-operator-catalog:v0.0.3"
Kubernetes cluster type:
Happens randomly for the latest stable OCP 4.7, OCP 4.8, OCP 4.9, OCP 4.10
$ operator-sdk version
operator-sdk v.1.14.0
scorecard-test v1.14.0
Possible Solution
It would be nice to have more detailed logs to identify what is the reason for this timeout.
FTR, I encountered a failure recently with verbose logging turned on. AFAICT the interesting part (i.e. what it is that's actually timing out) is not getting logged, but posting here in case it's useful to someone investigating this.
/go/src/github.com/stackrox/rox/operator//bin/operator-sdk-1.14.0 scorecard --storage-image="quay.io/rhacs-eng/mirror-busybox:@sha256:fe9664ceda9c13b7fd65f5bc3c0bca5060de3a66fedc39d33926b573599da3b7" --wait-time="40s" --verbose quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404
DEBU[0000] Debug logging is set
DEBU[0000] resolving host=quay.io
DEBU[0000] do request host=quay.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=opm/alpha request.method=HEAD url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/v3.69.0-nightly-20220404"
DEBU[0000] fetch response received host=quay.io response.header.connection=keep-alive response.header.content-length=112 response.header.content-type=application/json response.header.date="Mon, 04 Apr 2022 05:58:50 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server=nginx/1.14.1 response.header.www-authenticate="Bearer realm=\"https://quay.io/v2/auth\",service=\"quay.io\",scope=\"repository:rhacs-eng/stackrox-operator-bundle:pull\"" response.status="401 UNAUTHORIZED" url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/v3.69.0-nightly-20220404"
DEBU[0000] Unauthorized header="Bearer realm=\"https://quay.io/v2/auth\",service=\"quay.io\",scope=\"repository:rhacs-eng/stackrox-operator-bundle:pull\"" host=quay.io
DEBU[0000] do request host=quay.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=opm/alpha request.method=HEAD url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/v3.69.0-nightly-20220404"
DEBU[0000] fetch response received host=quay.io response.header.connection=keep-alive response.header.content-length=940 response.header.content-type=application/vnd.docker.distribution.manifest.v2+json response.header.date="Mon, 04 Apr 2022 05:58:50 GMT" response.header.docker-content-digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" response.header.server=nginx/1.14.1 response.header.strict-transport-security="max-age=63072000; preload" response.header.x-frame-options=DENY response.status="200 OK" url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/v3.69.0-nightly-20220404"
DEBU[0000] resolved desc.digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" host=quay.io
DEBU[0000] resolved name: quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404 bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] fetched bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" dir=bundle-421604376
DEBU[0000] {application/vnd.docker.distribution.manifest.v2+json sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12 940 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] fetch digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" mediatype=application/vnd.docker.distribution.manifest.v2+json size=940
DEBU[0000] do request digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" mediatype=application/vnd.docker.distribution.manifest.v2+json request.header.accept="application/vnd.docker.distribution.manifest.v2+json, */*" request.header.user-agent=opm/alpha request.method=GET size=940 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12"
DEBU[0000] fetch response received digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" mediatype=application/vnd.docker.distribution.manifest.v2+json response.header.connection=keep-alive response.header.content-length=940 response.header.content-type=application/vnd.docker.distribution.manifest.v2+json response.header.date="Mon, 04 Apr 2022 05:58:50 GMT" response.header.docker-content-digest="sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12" response.header.server=nginx/1.14.1 response.header.strict-transport-security="max-age=63072000; preload" response.header.x-frame-options=DENY response.status="200 OK" size=940 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/manifests/sha256:1f8462f55f91968c31cf86d54b6e9c2f8ef2e8144f108d095aaeb0a0929e9d12"
DEBU[0000] fetched bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" digest="sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8" dir=bundle-421604376
DEBU[0000] {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8 480 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] fetch digest="sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip size=480
DEBU[0000] fetched bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" digest="sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258" dir=bundle-421604376
DEBU[0000] fetched bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" digest="sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b" dir=bundle-421604376
DEBU[0000] {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258 29090 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] {application/vnd.docker.container.image.v1+json sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b 5623 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] fetch digest="sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b" mediatype=application/vnd.docker.container.image.v1+json size=5623
DEBU[0000] fetch digest="sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip size=29090
DEBU[0000] fetched bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" digest="sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097" dir=bundle-421604376
DEBU[0000] {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097 406 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] fetch digest="sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip size=406
DEBU[0000] do request digest="sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, */*" request.header.user-agent=opm/alpha request.method=GET size=480 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8"
DEBU[0000] do request digest="sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b" mediatype=application/vnd.docker.container.image.v1+json request.header.accept="application/vnd.docker.container.image.v1+json, */*" request.header.user-agent=opm/alpha request.method=GET size=5623 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b"
DEBU[0000] do request digest="sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, */*" request.header.user-agent=opm/alpha request.method=GET size=29090 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258"
DEBU[0000] do request digest="sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, */*" request.header.user-agent=opm/alpha request.method=GET size=406 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097"
DEBU[0000] fetch response received digest="sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip response.header.accept-ranges=bytes response.header.content-length=480 response.header.content-type=binary/octet-stream response.header.date="Mon, 04 Apr 2022 05:58:51 GMT" response.header.etag="\"51f66689c707be0494d193fc7650a5f9-1\"" response.header.last-modified="Mon, 04 Apr 2022 05:28:28 GMT" response.header.server=AmazonS3 response.header.x-amz-id-2="3dW6VmlVBdHl6fMvifIGpq0dWEcd8kDG9cpMSH9+QJzBpTB2Bd9QMnX5lAqBktyjbZluGbwpgNo=" response.header.x-amz-replication-status=COMPLETED response.header.x-amz-request-id=JKTGPACKW93BBZ6C response.header.x-amz-server-side-encryption=AES256 response.header.x-amz-version-id=hawtuRC76z9_U20jVHGUYfpHW8MdEeIH response.status="200 OK" size=480 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8"
DEBU[0000] fetch response received digest="sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b" mediatype=application/vnd.docker.container.image.v1+json response.header.accept-ranges=bytes response.header.content-length=5623 response.header.content-type=binary/octet-stream response.header.date="Mon, 04 Apr 2022 05:58:51 GMT" response.header.etag="\"1b7f8046facca76d808acb9350446a89-1\"" response.header.last-modified="Mon, 04 Apr 2022 05:28:29 GMT" response.header.server=AmazonS3 response.header.x-amz-id-2="q4gHx9LgX6QvEl6RM0H2DKjDo6z+cIYIySZnWBvbOMaDDbARUuM0d74aYbfgE7yri71i1WX71RI=" response.header.x-amz-replication-status=COMPLETED response.header.x-amz-request-id=JKTNA878PQ0CN6KH response.header.x-amz-server-side-encryption=AES256 response.header.x-amz-version-id=xY5B4I9EAaNpprD1lCoha71KhEvvS.u_ response.status="200 OK" size=5623 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:85dc11d7744b426c884c46a03f87365e54ed43477a62dbcc6cb2737616e10e7b"
DEBU[0000] fetch response received digest="sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip response.header.accept-ranges=bytes response.header.content-length=406 response.header.content-type=binary/octet-stream response.header.date="Mon, 04 Apr 2022 05:58:51 GMT" response.header.etag="\"b041299f20acd151b6be2316d966ae21-1\"" response.header.last-modified="Mon, 04 Apr 2022 05:28:28 GMT" response.header.server=AmazonS3 response.header.x-amz-id-2="R9MCOj3SrFzgDT5d7Snp0WlhgkPky19ior+lpfV7KVbpjq0/Vcfwe7LnJBS3qaSmfbLXW9JtnCk=" response.header.x-amz-replication-status=COMPLETED response.header.x-amz-request-id=JKTTWG8J16AQG14R response.header.x-amz-server-side-encryption=AES256 response.header.x-amz-version-id=i7UK3EsZch9IgD6o0OGoB5bMxBwlu7hr response.status="200 OK" size=406 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097"
DEBU[0000] fetch response received digest="sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258" mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip response.header.accept-ranges=bytes response.header.content-length=29090 response.header.content-type=binary/octet-stream response.header.date="Mon, 04 Apr 2022 05:58:51 GMT" response.header.etag="\"a78a659753af9dcca852fab61394e9ba-1\"" response.header.last-modified="Mon, 04 Apr 2022 05:28:28 GMT" response.header.server=AmazonS3 response.header.x-amz-id-2="NU0J2CLLlW4jq5J1tarRPv4iBDw1CfMMl6FlbCRq75SQz3uL/OkXX4rJJO+XDfQEIjB4SASaZaM=" response.header.x-amz-replication-status=COMPLETED response.header.x-amz-request-id=JKTNTDYNJ7QT3XT9 response.header.x-amz-server-side-encryption=AES256 response.header.x-amz-version-id=XqQhNIlRTb3aOb8QsMDVQtwbiAg3buHp response.status="200 OK" size=29090 url="https://quay.io/v2/rhacs-eng/stackrox-operator-bundle/blobs/sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258"
DEBU[0000] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:7e654a7e8519f043e19d3f91b5709b424dc06c0c7787ee8cfd4a0eb14ad63258 29090 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] unpigz not found, falling back to go gzip error="exec: \"unpigz\": executable file not found in $PATH"
DEBU[0000] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:80583ed86a84ec80184a5bb7917a04b6d568b5ff1464f202986f7629b84c0097 406 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
DEBU[0000] unpacking layer: {application/vnd.docker.image.rootfs.diff.tar.gzip sha256:82023726a6e890b5b0db920e4eea2841c8557f931994c36a39b1b5a4b76a9dd8 480 [] map[] <nil>} bundle="quay.io/rhacs-eng/stackrox-operator-bundle:v3.69.0-nightly-20220404" dir=bundle-421604376
Error: error running tests context deadline exceeded
Usage:
operator-sdk scorecard [flags]
Flags:
-c, --config string path to scorecard config file
-h, --help help for scorecard
--kubeconfig string kubeconfig path
-L, --list Option to enable listing which tests are run
-n, --namespace string namespace to run the test images in
-o, --output string Output format for results. Valid values: text, json, xunit (default "text")
-l, --selector string label selector to determine which tests are run
-s, --service-account string Service account to use for tests (default "default")
-x, --skip-cleanup Disable resource cleanup after tests are run
-b, --storage-image string Storage image to be used by the Scorecard pod (default "docker.io/library/busybox@sha256:c71cb4f7e8ececaffb34037c2637dc86820e4185100e18b4d02d613a9bd772af")
-t, --test-output string Test output directory. (default "test-output")
-u, --untar-image string Untar image to be used by the Scorecard pod (default "registry.access.redhat.com/ubi8@sha256:910f6bc0b5ae9b555eb91b88d28d568099b060088616eba2867b07ab6ea457c7")
-w, --wait-time duration seconds to wait for tests to complete. Example: 35s (default 30s)
Global Flags:
--plugins strings plugin keys to be used for this subcommand execution
--verbose Enable verbose logging
FATA[0040] error running tests context deadline exceeded
make: *** [Makefile:401: bundle-test-image] Error 1
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
On Tue, 4 Oct 2022 at 03:01, OpenShift Bot @.***> wrote:
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
— Reply to this email directly, view it on GitHub https://github.com/operator-framework/operator-sdk/issues/5452#issuecomment-1266270441, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADXPTAQ6KM5PFD6GPDGGULWBN6V7ANCNFSM5J5U4XVQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Hello team, that's an old issue but we keep seeing it regularly while running Preflight which currently runs operator-sdk 1.22.2.
Here are the logs from preflight.log (BasicSpecCheck.json is not created at all):
time="2022-11-21T02:42:04Z" level=debug msg="running check: ScorecardBasicSpecCheck"
time="2022-11-21T02:42:04Z" level=trace msg="Running operator-sdk scorecard check for quay.io/rh-nfv-int/testpmd-operator-bundle@sha256:5e28f883faacefa847104ebba1a1a22ee897b7576f0af6b8253c68b5c8f42815"
time="2022-11-21T02:42:04Z" level=trace msg="--selector=[test=basic-check-spec-test]"
time="2022-11-21T02:42:04Z" level=debug msg="Using quay.io/operator-framework/scorecard-test@sha256:b939bdf25de65d0c4256ac799b02a58335b705e24cbdd66d61e344982eb6156d as the scorecard test image"
time="2022-11-21T02:42:04Z" level=trace msg="running scorecard with the following invocation[operator-sdk scorecard --output json --selector=test=basic-check-spec-test --kubeconfig /kubeconfig --wait-time 240s --namespace preflight-testing --service-account default --config /tmp/scorecard-test-config-4476853.yaml --verbose /tmp/preflight-3276679150/fs]"
time="2022-11-21T02:46:04Z" level=debug msg="operator-sdk scorecard failed to run properly."
time="2022-11-21T02:46:04Z" level=debug msg="stderr: time=\"2022-11-21T02:42:04Z
\" level=debug msg=\"Debug logging is set
\"\nError: error running tests context deadline exceeded
\nUsage:\n operator-sdk scorecard [flags]\n\n
Flags:\n -c, --config string path to scorecard config file\n -h, --help help for scorecard\n --kubeconfig string kubeconfig path\n -L, --list
Option to enable listing which tests are run\n -n, --namespace string namespace to run the test images in\n -o, --output string
Output format for results. Valid values: text, json, xunit (default \"text\")\n -l, --selector string label selector to determine which tests are run\n -s, --service-account string Service account to use for tests (default \"default\")\n -x, --skip-cleanup
Disable resource cleanup after tests are run\n -b, --storage-image string Storage image to be used by the Scorecard pod
(default \"docker.io/library/busybox@sha256:c71cb4f7e8ececaffb34037c2637dc86820e4185100e18b4d02d613a9bd772af\")\n -t, --test-output string
Test output directory. (default \"test-output\")\n -u, --untar-image string
Untar image to be used by the Scorecard pod (default \"registry.access.redhat.com/ubi8@sha256:910f6bc0b5ae9b555eb91b88d28d568099b060088616eba2867b07ab6ea457c7\")\n -w, --wait-time duration seconds to wait for tests to complete.
Example: 35s (default 30s)\n\n
Global Flags:\n --plugins strings plugin keys to be used for this subcommand execution\n --verbose Enable verbose logging\n\ntime=\"2022-11-21T02:46:04Z\" level=fatal msg=\"error running tests context deadline exceeded\"\n"
time="2022-11-21T02:46:04Z" level=info msg="check completed: ScorecardBasicSpecCheck" err="failed to run operator-sdk scorecard: exit status 1" result=ERROR
Now we have more logs (thank you team!) and see that the timeout happens here: Untar image to be used by the Scorecard pod https://github.com/operator-framework/operator-sdk/blob/5d541d01b5d9d173a6927faf038a781f845b1003/internal/cmd/operator-sdk/scorecard/cmd.go#L96-L98
@asmacdo @theishshah do you have any clues?
cc: @acornett21
Upd. In half of such situations, the scorecard-test pod is stuck in the namespace, failing to terminate. This last bad thing should be easy to fix, whatever the reason for the initial issue.
oc get all -n preflight-testing
NAME READY STATUS RESTARTS AGE
pod/scorecard-test-tjbf 0/1 Terminating 0 14m
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
/remove-lifecycle stale
/remove-lifecycle rotten
Hi @porridge, I noticed that you are using a very old version 1.14.0. On our side, migration to 1.26.0 significantly reduced the frequency of timeouts.
@tkrishtop we used 1.14 a year ago, when I wrote that comment :-) These days we just retry the command 4 times and we've never seen the error since. I guess the cause was long image pulls, but it would be great to confirm this by actually having a useful error message.
@porridge I managed to trace down the error. The timeout happens here: Untar image to be used by the Scorecard pod, here is the source code:
https://github.com/operator-framework/operator-sdk/blob/5d541d01b5d9d173a6927faf038a781f845b1003/internal/cmd/operator-sdk/scorecard/cmd.go#L96-L98
cc: @theishshah
@tkrishtop Recently operator-sdk switched to using their own images for this cmd. So instead of using an image from docker.io and an image from registry.access.redhat.com both images come from quay. This change is in 1.28.0 you can see the PR here. Have you upgraded to 1.28.0 yet?
Hi @acornett21, thank you for the information! Typically, we run the scorecard-sdk from preflight, which currently uses version 1.26.0. However, I do have the option to run the standalone scorecard-sdk, so perhaps I'll activate it for our daily runs using version 1.28.0.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
/remove-lifecycle rotten
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale