[backfill daemon run retries 1/n] update how we determine backfill completion to account for retried runs
Summary & Motivation
The backfill daemon doesn't account for run retries. See https://github.com/dagster-io/internal/discussions/12460 for more context. We've decided that we want the daemon to account for automatic and manual retries of runs that occur while the backfill is still in progress. This requires two changes: ensuring the backfill isn't marked completed if there is an in-progress run or a failed run that will be automatically retried, and updating the daemon to take the results of retried runs into account when deciding which partitions to materialize in the next iteration.
This PR addresses the first change: ensuring the backfill isn't marked completed if there is an in-progress run or a failed run that will be automatically retried.
Currently a backfill is marked complete when all targeted asset partitions are in a terminal state (successfully materialized, failed, or downstream of a failed partition). Since failed runs may be retried, all asset partitions can be in a terminal state while a retry is still in progress that could change the state of some of them. This means that if any of the backfill's runs are still in progress, we need to wait for them to complete before marking the backfill complete.
Additionally, we need to account for a race condition where a failed run may have a retry automatically launched for it, but the daemon marks the backfill complete before the retried run is queued. This PR adds an additional check to ensure that no failed runs are about to be retried.
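For reviewers, the completion rule described above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: `Run`, `RunStatus`, the `will_retry` flag, and `backfill_is_complete` are hypothetical names made up for this sketch, and how the real implementation detects a pending retry is not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RunStatus(Enum):
    # Hypothetical, simplified set of run statuses for this sketch.
    QUEUED = "QUEUED"
    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"


# Statuses treated as "in progress" in this sketch (an assumption).
IN_PROGRESS = {RunStatus.QUEUED, RunStatus.STARTED}


@dataclass(frozen=True)
class Run:
    run_id: str
    status: RunStatus
    # Hypothetical flag: an automatic retry will be launched for this run.
    will_retry: bool = False


def backfill_is_complete(all_partitions_terminal: bool, runs: list[Run]) -> bool:
    """Return True only when nothing can still change the partition states."""
    # Existing rule: every targeted asset partition is in a terminal state.
    if not all_partitions_terminal:
        return False
    # New rule 1: a run that is still in progress may change partition state.
    if any(run.status in IN_PROGRESS for run in runs):
        return False
    # New rule 2 (race condition): a failed run whose retry has not been
    # queued yet must also keep the backfill open.
    if any(run.status is RunStatus.FAILURE and run.will_retry for run in runs):
        return False
    return True
```

In this sketch, a failed run with no pending retry does not block completion, which matches the existing behavior where failed partitions count as terminal.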
How I Tested These Changes
New unit tests.
- #25900
- #25853
- #25771 👈 (View in Graphite)
- #26054
- #25932
- #26046
- master
@clairelin135 @gibsondan bumping for review (and the stacked PR)!
@gibsondan this is ready for review now that the retry tag changes have landed!