dagster
Asset with BackfillPolicy.single_run() and multipartition will sometimes create multiple job runs
Dagster version
1.8.3
What's the issue?
Using this example code:
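from dagster import (
    BackfillPolicy,
    DailyPartitionsDefinition,
    MultiPartitionsDefinition,
    StaticPartitionsDefinition,
    asset,
)

# Two dimensions: a daily "date" dimension and a three-value "static" dimension.
partitions_def = MultiPartitionsDefinition(
    {
        "date": DailyPartitionsDefinition(start_date="2024-08-20"),
        "static": StaticPartitionsDefinition(
            ["partition1", "partition2", "partition3"]
        ),
    }
)


@asset(
    partitions_def=partitions_def,
    # Ask for the whole backfill selection to be materialized in one run.
    backfill_policy=BackfillPolicy.single_run(),
    metadata={"partition_expr": "mmm"},
    io_manager_key="duckdb_io_manager",
)
def test_multi_part_asset(context):
    context.log.info("hi")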
(Ignore the IO manager; I was just using one that writes its output to the log.)
I am trying to run a backfill as a single run, but I noticed something odd. If I select:
2 date x 2 static
it spawns 2 job runs, one per date partition, each covering both static partitions.
I thought there might be a pattern to this, for example that only one dimension is batched into a single run. However, if I select
3 date x 3 static, 3 date x 4 static, or a few other combinations, each one is launched as a single run.
I am not sure why or how this is happening. Is there some logic I am missing, or is it a bug?
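To make the 2 date x 2 static case concrete, here is a minimal sketch in plain Python (no Dagster calls) that enumerates the selected partition keys and contrasts the expected grouping with the one I observed. The two dates are hypothetical stand-ins for "the latest 2 dates", and the "date|static" key format is my assumption about how multipartition keys are rendered.

```python
from collections import defaultdict
from itertools import product

# Hypothetical selection: 2 date partitions and 2 static partitions.
dates = ["2024-08-20", "2024-08-21"]
statics = ["partition1", "partition2"]

# Each (date, static) pair is one multipartition key.
# Assumption: keys are rendered as "date|static".
selected_keys = [f"{d}|{s}" for d, s in product(dates, statics)]

# Expected with BackfillPolicy.single_run(): one run covering every key.
expected_runs = [selected_keys]

# Observed for 2 x 2: one run per date, each containing both static partitions.
keys_by_date = defaultdict(list)
for key in selected_keys:
    date, _static = key.split("|")
    keys_by_date[date].append(key)
observed_runs = list(keys_by_date.values())

print(f"selected keys: {len(selected_keys)}")
print(f"expected runs: {len(expected_runs)}")
print(f"observed runs: {len(observed_runs)}")
```

This only models the grouping described in this report; it does not reproduce how Dagster actually decides to split the selection.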
What did you expect to happen?
Any number or combination of partitions to run under a single job run.
How to reproduce?
Use the code from the description above and add it as an asset. Materialize the asset in the UI and select the latest 2 date partitions and 2 static partitions.
Deployment type
Local
Deployment details
Just testing locally with the defaults in 1.8.3
Additional information
I've noticed that there have recently been some fixes to backfills, including the ability to do a single run from a job. I am not launching this from a job and haven't tested that path. I am using the latest Dagster version so that I pick up all of the newest fixes.
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.