fix: skip cancellation if no config found
Skip flow-run cancellation if there is no infrastructure cancellation found. This prevents an unbound variable from being referenced.
Fixes issues where the deployment has been deleted before the flow-run can be cancelled (such as in https://prefect-community.slack.com/archives/CL09KU1K7/p1708020616928889)
Checklist
- [ ] This pull request references any related issue by including "closes
<link to issue>"- If no issue exists and your change is not a small fix, please create an issue first.
- [ ] This pull request includes tests or only affects documentation.
- [ ] This pull request includes a label categorizing the change e.g.
maintenance,fix,feature,enhancement,docs.
For documentation changes:
- [ ] This pull request includes redirect settings in
netlify.tomlfor files that are removed or renamed.
For new functions or classes in the Python SDK:
- [ ] This pull request includes helpful docstrings.
- [ ] If a new Python file was added, this pull request contains a stub page in the Python SDK docs and an entry in
mkdocs.ymlnavigation.
Deploy Preview for prefect-docs-preview ready!
| Name | Link |
|---|---|
| Latest commit | 25eeaa8ab610cb7790b75c4b55a114eab8cf1a51 |
| Latest deploy log | https://app.netlify.com/sites/prefect-docs-preview/deploys/6607305186c8c600082ce81f |
| Deploy Preview | https://deploy-preview-12001--prefect-docs-preview.netlify.app |
| Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site configuration.
@desertaxle can I get your thoughts on this? The bug in this case happens when our flow-run has an infrastructure_pid but the configuration object isn't found. I think it makes sense to manually mark the flow-run as cancelled if we can't find the configuration but I can imagine a scenario where certain types of workers can handle a missing configuration object differently. An alternative could be to push an empty config object into the worker's kill_infrastructure method.
@urimandujano this change may not work well in the scenario where a runner is managing the flow run. The worker marking the flow run as CANCELLED may prevent a runner from shutting down a flow run because it looks for the CANCELLING state for flow runs it manages.
That would only apply if the deployment was deleted while the flow run is still executing or in cancelling. We generally fail pretty spectacularly when a deployment is deleted while a flow run is executing.
I think this is a reasonable solution, but it might also be worth looking into whether we should prevent a deployment from being deleted if there is an active flow run.
This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.
I'm going to go ahead and mark this as ready for review since I'm pretty sure it'll prevent the specific case the user was running into (a worker swamped with logs related to missing configuration). I'll open an issue for digging into what it would look like to prevent deleting a deployment if it has active flow runs.
This is the new issue for tracking the enhancement in behavior.
This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.