prefect icon indicating copy to clipboard operation
prefect copied to clipboard

fix: skip cancellation if no config found

Open urimandujano opened this issue 1 year ago • 3 comments

Skip flow-run cancellation if there is no infrastructure cancellation found. This prevents an unbound variable from being referenced.

Fixes issues where the deployment has been deleted before the flow-run can be cancelled (such as in https://prefect-community.slack.com/archives/CL09KU1K7/p1708020616928889)

Checklist

  • [ ] This pull request references any related issue by including "closes <link to issue>"
    • If no issue exists and your change is not a small fix, please create an issue first.
  • [ ] This pull request includes tests or only affects documentation.
  • [ ] This pull request includes a label categorizing the change e.g. maintenance, fix, feature, enhancement, docs.

For documentation changes:

  • [ ] This pull request includes redirect settings in netlify.toml for files that are removed or renamed.

For new functions or classes in the Python SDK:

  • [ ] This pull request includes helpful docstrings.
  • [ ] If a new Python file was added, this pull request contains a stub page in the Python SDK docs and an entry in mkdocs.yml navigation.

urimandujano avatar Feb 15 '24 22:02 urimandujano

Deploy Preview for prefect-docs-preview ready!

Name Link
Latest commit 25eeaa8ab610cb7790b75c4b55a114eab8cf1a51
Latest deploy log https://app.netlify.com/sites/prefect-docs-preview/deploys/6607305186c8c600082ce81f
Deploy Preview https://deploy-preview-12001--prefect-docs-preview.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

netlify[bot] avatar Feb 15 '24 22:02 netlify[bot]

@desertaxle can I get your thoughts on this? The bug in this case happens when our flow-run has an infrastructure_pid but the configuration object isn't found. I think it makes sense to manually mark the flow-run as cancelled if we can't find the configuration but I can imagine a scenario where certain types of workers can handle a missing configuration object differently. An alternative could be to push an empty config object into the worker's kill_infrastructure method.

urimandujano avatar Feb 15 '24 22:02 urimandujano

@urimandujano this change may not work well in the scenario where a runner is managing the flow run. The worker marking the flow run as CANCELLED may prevent a runner from shutting down a flow run because it looks for the CANCELLING state for flow runs it manages.

That would only apply if the deployment was deleted while the flow run is still executing or in cancelling. We generally fail pretty spectacularly when a deployment is deleted while a flow run is executing.

I think this is a reasonable solution, but it might also be worth looking into whether we should prevent a deployment from being deleted if there is an active flow run.

desertaxle avatar Feb 15 '24 22:02 desertaxle

This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.

github-actions[bot] avatar Mar 01 '24 18:03 github-actions[bot]

I'm going to go ahead and mark this as ready for review since I'm pretty sure it'll prevent the specific case the user was running into (a worker swamped with logs related to missing configuration). I'll open an issue for digging into what it would look like to prevent deleting a deployment if it has active flow runs.

This is the new issue for tracking the enhancement in behavior.

urimandujano avatar Mar 06 '24 20:03 urimandujano

This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.

github-actions[bot] avatar Mar 20 '24 21:03 github-actions[bot]