[dagster-airlift][rfc] Constrain proxy operator to one run
Summary & Motivation
Right now, we execute all runs in parallel for the proxy operator, and blindly choose the first job we come across for each asset. This causes incorrect behavior in older versions of Dagster, where we aren't guaranteed to have an implicit asset job, and also causes problems if we pick a user-created job instead of the implicit asset job. It can also cause problems if the assets for a single task are spread across code locations, as there can be inter-asset dependencies within each code location.
This PR prototypes making two simplifying assumptions that let us rule out these cases:
- Dagster is version 1.8 or greater
- All assets mapped to a given task exist within the same code location.
Together, these two assumptions ensure that all assets for a task are materializable within the same job, which means the client doesn't need to work out a topological ordering across jobs.
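As a rough illustration of the second assumption, the check could look something like the sketch below. It is not the code in this PR: `group_assets_by_location`, `resolve_single_run`, and the plain-dict mapping from asset key to code location are all hypothetical stand-ins for whatever the proxy operator actually gets back from the Dagster instance.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_assets_by_location(
    asset_keys: Iterable[str], location_of: Dict[str, str]
) -> Dict[str, List[str]]:
    """Bucket asset keys by the code location that defines them.

    `location_of` is a hypothetical asset-key -> code-location lookup.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key in asset_keys:
        grouped[location_of[key]].append(key)
    return dict(grouped)


def resolve_single_run(
    asset_keys: Iterable[str], location_of: Dict[str, str]
) -> Tuple[str, List[str]]:
    """Return (code_location, asset_keys) for the one run to launch.

    Raises if the task maps to no assets, or to assets in more than one
    code location, since a single job could not materialize them all.
    """
    grouped = group_assets_by_location(asset_keys, location_of)
    if len(grouped) != 1:
        raise ValueError(
            "Expected assets from exactly one code location, "
            f"found {sorted(grouped)}"
        )
    ((location, keys),) = grouped.items()
    return location, sorted(keys)
```

With a single location and a single job, one run covers every asset for the task, and ordering between those assets is left to Dagster's own execution of that job.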
We can relax the 1.8 constraint by letting the user override which job we attempt to materialize, pointing it at a job of their choosing that contains all of the assets. This could be an optional field in the proxied_state files.
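A minimal sketch of how such an override might be resolved is below. Everything in it is an assumption for illustration: the `select_job` helper, the `override` argument (standing in for a hypothetical optional field read from a proxied_state file), and the representation of jobs as a dict from job name to the set of asset keys the job contains.

```python
from typing import Dict, Optional, Set


def select_job(
    required_assets: Set[str],
    jobs: Dict[str, Set[str]],
    default_job: str,
    override: Optional[str] = None,
) -> str:
    """Pick the job to launch for a task.

    Uses the user-supplied override when present, otherwise the default
    (for example the implicit asset job). Either way, the chosen job
    must exist and contain every asset mapped to the task.
    """
    name = override if override is not None else default_job
    if name not in jobs:
        raise ValueError(f"Job {name!r} not found in code location")
    missing = required_assets - jobs[name]
    if missing:
        raise ValueError(
            f"Job {name!r} is missing assets: {sorted(missing)}"
        )
    return name
```

Validating the override against the full asset set keeps the single-run guarantee intact even when the implicit asset job is unavailable.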
The version 1.8
How I Tested These Changes
If we like this direction I'll go back and add tests
Changelog
NOCHANGELOG