dagster icon indicating copy to clipboard operation
dagster copied to clipboard

[DS][29/n] Create ParentNewerCondition

Open OwenKephart opened this issue 9 months ago • 1 comments

Summary & Motivation

Originally, this was conceived as a condition that would be used as part of an any_deps_match() sort of context. There are a few problems with this:

  1. All other conditions are simply true or false for a given asset partition, regardless of which child_partition we consider to be the "downstream". e.g. an asset partition is unambiguously missing or not missing. However, if you have A -> B with A unpartitioned and B partitioned, then it's possible for A to be newer than some partitions of B but older than others. This means that if we wanted to evaluate something like: any_deps_match(missing() | newer_than_child()), then we would have to evaluate the inner expression of missing | newer_than_child on a per-child-partition basis, which is both expensive and extremely challenging to visualize
  2. We want to eventually implement this via the StaleStatusResolver, which fits way better with the mental model of "a child partition can have the status (parent is newer)" rather than "a child partition can have the status (any parent has the status (newer than child partition))"

The downside is that this does not compose as nicely with other conditions, but I think this is actually ok. For example, we don't support "all parents newer" out of the box, but I think this is a good thing, as that sort of behavior is actually a bit of a footgun, which leads to weird synchronization issues if you (e.g.) materialize one of the two parents, then manually materialize the child. Now you're on a weird new cadence in which your child will not materialize after the second parent updates.

For most cases in which you want to wait for more than one parent to update, you're either filling in time partitions as they come in (and so you can just wait for no parent partitions to be missing, rather than all parent partitions to update), or you want to materialize an unpartitioned asset regularly (in which you can wait for all parent partitions to be updated since a given cron tick, rather than waiting for them all to be updated more recently than you have updated).

How I Tested These Changes

OwenKephart avatar May 03 '24 18:05 OwenKephart

  • #21641 Graphite: 2 dependent PRs (#21670 Graphite, #21671 Graphite) 👈
  • #21640 Graphite
  • #21648 Graphite
  • #21615 Graphite
  • #21613 Graphite
  • #21612 Graphite
  • #21573 Graphite
  • #21546 Graphite
  • #21545 Graphite
  • #21541 Graphite
  • #21540 Graphite
  • #21539 Graphite
  • #21538 Graphite
  • #21537 Graphite
  • #21536 Graphite
  • #21535 Graphite
  • #21521 Graphite
  • #21520 Graphite
  • #21511 Graphite: 1 other dependent PR (#21512 Graphite)
  • #21510 Graphite
  • #21508 Graphite
  • #21507 Graphite
  • #21505 Graphite
  • #21504 Graphite
  • #21503 Graphite
  • #21502 Graphite
  • #21501 Graphite
  • #21500 Graphite
  • #21499 Graphite
  • master

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @OwenKephart and the rest of your teammates on Graphite Graphite

OwenKephart avatar May 03 '24 18:05 OwenKephart

All caps too aggressive aesthetically imo

On Tue, May 14 2024 at 9:43 AM, OwenKephart < @.*** > wrote:

@.**** commented on this pull request.

In python_modules/dagster/dagster/_core/asset_graph_view/asset_graph_view.py ( https://github.com/dagster-io/dagster/pull/21641#discussion_r1600069555 ) :

  • def compute_asset_partitions(self) -> AbstractSet[AssetKeyPartitionKey]:
  •    return
    

self._compatible_subset.asset_partitions +

Ok I updated the name to compute_asset_partitions_EXPENSIVE , which I think is a sufficient deterrent for now, and we can move this discussion to a future diff

— Reply to this email directly, view it on GitHub ( https://github.com/dagster-io/dagster/pull/21641#discussion_r1600069555 ) , or unsubscribe ( https://github.com/notifications/unsubscribe-auth/AG3IK6KLIK7SC35FNPNQAY3ZCIIH7AVCNFSM6AAAAABHF7TBCOVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDANJVGQZDIMJXGY ). You are receiving this because your review was requested. Message ID: <dagster-io/dagster/pull/21641/review/2055424176 @ github. com>

schrockn avatar May 14 '24 13:05 schrockn

Merge activity

  • May 14, 10:38 AM EDT: @OwenKephart started a stack merge that includes this pull request via Graphite.
  • May 14, 10:47 AM EDT: Graphite rebased this pull request as part of a merge.
  • May 14, 10:49 AM EDT: Graphite rebased this pull request as part of a merge.
  • May 14, 10:50 AM EDT: @OwenKephart merged this pull request with Graphite.

OwenKephart avatar May 14 '24 14:05 OwenKephart