
[dagster-airlift][rfc] Proxy operator launches partitioned runs

Open dpeng817 opened this issue 4 months ago • 1 comment

Summary & Motivation

Adds a pluggable implementation to BaseAssetsOperator that maps the current Airflow run to a partitioned run in Dagster. By default, we do the same thing as the sensor: we attempt to map the logical date directly to a partition.

Important points to note:

  • I make the simplifying assumption that all assets within a given task share the same partitions definition. This lets us keep the "one run" constraint from a previous PR.
  • There are two points of pluggability that I think make sense to expose. The first is the method get_partition_key(context, partition_keys), which lets users pick which partition key from the list to use. The second is a pluggable default implementation, translate_logical_date_to_partition_key, which takes a list of partition key formats. This supports TimeWindowPartitionsDefinitions that use a custom format / cron schedule without requiring a full reimplementation: users only need to override get_partition_key to call translate_logical_date_to_partition_key with their custom format.
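
To make the two extension points concrete, here is a rough sketch of how they might fit together. It is not the code in this PR: SketchAssetsOperator stands in for BaseAssetsOperator, the context is modeled as a plain dict with a "logical_date" entry, and the signature of translate_logical_date_to_partition_key and the default format list are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Sequence

# Assumed default formats for illustration; the real list may differ.
DEFAULT_PARTITION_KEY_FORMATS: Sequence[str] = ("%Y-%m-%d", "%Y-%m-%d-%H:%M")


def translate_logical_date_to_partition_key(
    logical_date: datetime,
    partition_keys: Sequence[str],
    formats: Sequence[str] = DEFAULT_PARTITION_KEY_FORMATS,
) -> str:
    """Format the logical date with each candidate format, in order,
    and return the first result that is an existing partition key."""
    known_keys = set(partition_keys)
    for fmt in formats:
        candidate = logical_date.strftime(fmt)
        if candidate in known_keys:
            return candidate
    raise ValueError(
        f"No partition key matches logical date {logical_date.isoformat()} "
        f"using formats {list(formats)}"
    )


class SketchAssetsOperator:
    """Hypothetical stand-in for BaseAssetsOperator (no Airflow dependency)."""

    def get_partition_key(
        self, context: Mapping[str, Any], partition_keys: Sequence[str]
    ) -> str:
        # Default behavior: map the logical date directly to a partition key.
        return translate_logical_date_to_partition_key(
            context["logical_date"], partition_keys
        )


class CustomFormatOperator(SketchAssetsOperator):
    """User override: reuse the default translation with a custom key format."""

    def get_partition_key(
        self, context: Mapping[str, Any], partition_keys: Sequence[str]
    ) -> str:
        return translate_logical_date_to_partition_key(
            context["logical_date"], partition_keys, formats=["%Y/%m/%d"]
        )


context = {"logical_date": datetime(2024, 10, 16, tzinfo=timezone.utc)}
daily_key = SketchAssetsOperator().get_partition_key(
    context, ["2024-10-15", "2024-10-16"]
)
custom_key = CustomFormatOperator().get_partition_key(
    context, ["2024/10/15", "2024/10/16"]
)
```

In this sketch the override only changes the format list, which is the "no full reimplementation" case described above. A user who needs entirely different selection logic would replace the body of get_partition_key instead.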

How I Tested These Changes

Added a new test that takes a daily DAG and constructs a daily partitioned materialization. It might be worth testing the other formatting cases, as well as pluggability.

Changelog

NOCHANGELOG

dpeng817 • Oct 16 '24 22:10