Functional tests fail due to scan timeout
We're seeing the following issue crop up consistently in CI:
2024/07/22 19:38:13 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
2024/07/22 19:38:18 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
2024/07/22 19:38:19 waiting until suite test-scan-has-profile-g-u-i-d reaches target status 'DONE'. Current status: RUNNING
main_test.go:329: timed out waiting for the condition
--- FAIL: TestScanHasProfileGUID (1800.65s)
=== RUN TestMixProductScan
In this case, TestScanHasProfileGUID failed because the scan it started was still RUNNING when the 30-minute timeout expired:
https://github.com/ComplianceAsCode/compliance-operator/blob/master/pkg/apis/compliance/v1alpha1/compliancescan_types.go#L218-L221
We should find a way to make this more resilient so that we don't need to recheck as many jobs to get patches merged.
An ocp4-stig-node scan ran out of time recently:
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/54636/rehearse-54636-pull-ci-ComplianceAsCode-content-master-4.12-e2e-aws-ocp4-stig-node/1815389970232250368
That's interesting, since the content testing uses a different E2E runner (ComplianceAsCode/ocp4e2e) from the functional testing we have in ComplianceAsCode/compliance-operator.
That could mean we actually have a bug somewhere in the operator that manifests as a scan that hangs and never completes.