
Functional tests fail due to scan timeout

Open rhmdnd opened this issue 1 year ago • 2 comments

We're seeing the following issue crop up consistently in CI:

2024/07/22 19:38:13 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
2024/07/22 19:38:18 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
2024/07/22 19:38:19 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
    main_test.go:329: timed out waiting for the condition
--- FAIL: TestScanHasProfileGUID (1800.65s)
=== RUN   TestMixProductScan

In this case, the TestScanHasProfileGUID test failed because the scan involved in the test didn't complete within the 30-minute timeout:

https://github.com/ComplianceAsCode/compliance-operator/blob/master/pkg/apis/compliance/v1alpha1/compliancescan_types.go#L218-L221

We should find a way to make this more resilient so that we don't need to recheck as many jobs to get patches merged.

rhmdnd avatar Jul 23 '24 15:07 rhmdnd

An ocp4-stig-node scan ran out of time recently: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/54636/rehearse-54636-pull-ci-ComplianceAsCode-content-master-4.12-e2e-aws-ocp4-stig-node/1815389970232250368

yuumasato avatar Jul 23 '24 15:07 yuumasato

That's interesting, since the content testing uses a different E2E runner (ComplianceAsCode/ocp4e2e) from the functional testing we have in ComplianceAsCode/compliance-operator.

That could mean we actually have a bug in the operator somewhere that manifests as a bricked scan.

rhmdnd avatar Jul 23 '24 16:07 rhmdnd