ratify
Azure test cleanup runs before tests complete
What happened in your environment?
The Azure e2e tests fail intermittently on the main branch. The new test cleanup stage seems to run concurrently with the actual tests, which leads to the resource group being deleted prematurely.
What did you expect to happen?
No response
What version of Kubernetes are you running?
No response
What version of Ratify are you running?
No response
Anything else you would like to add?
No response
Are you willing to submit PRs to contribute to this bug fix?
- [ ] Yes, I am willing to implement it.
Do you mean the test cleanup stage caused the intermittent failure? I didn't see failures in the recent builds; I guess you re-ran the failed tests.
@binbin-li It seems like the test cleanup stage might run in parallel with the actual test job, or it doesn't wait for all jobs to complete. Take this run as an example: https://github.com/deislabs/ratify/actions/runs/5979145301
The errors point to the resource already being deleted. I'm also curious about scenarios where multiple AKS tests are running in parallel. The resource cleanup will clean up all prefixed resource groups.
I suspect that it was deleted by a previous action run on the main branch. The delete job was just deleting resources tagged with `ratifye2e`, and the previous job looks to have finished just a few minutes earlier.
As for multiple AKS tests running in parallel, the cleanup stage would delete resources across tests, as you mentioned. This is a limitation of the current setup, since it was designed to delete resources left by previous runs as well.
Ahh yes, good point. That's likely it. We'll probably run into this issue only when multiple AKS build-pr runs overlap. In this case, I created the release branch right after the PR merged for chart updates, which led to the conflict. I'm hoping this is not very common.