Make Output metadata available on the OutputContext object

Open OwenKephart opened this issue 3 years ago • 1 comments

What's the use case?

When you yield an Output from the body of an op, you have the opportunity to add metadata to that event in one of two ways:

  1. yield Output(..., metadata={"some": "metadata"})
  2. context.add_output_metadata(...)

Inside the IOManager, you might want to use this runtime metadata to modulate how you process that output. If metadata is supplied the first way, this is essentially impossible, as you don't have access to the Output object (perhaps you could do some gross thing to get it, but I didn't feel like digging into that). If metadata is supplied the second way, you can call context.step_context.get_output_metadata(context.name) to retrieve it. That works OK, but it is a little confusing, as nowhere else in dagster does it matter which way the output metadata was supplied.

Ideas of implementation

When building the OutputContext, we could supply a new field, "runtime_metadata" (name TBD).
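A minimal sketch of what that could look like, written with plain dataclasses rather than dagster's real classes. Everything here is hypothetical: OutputEvent, StepState, SketchOutputContext and build_output_context are stand-in names, and letting inline metadata win on a key collision is an arbitrary choice for illustration, not a statement about how dagster behaves.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class OutputEvent:
    """Stand-in for a yielded Output: a value plus inline metadata (way 1)."""

    value: Any
    output_name: str = "result"
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class StepState:
    """Stand-in for per-step state that records add_output_metadata calls (way 2)."""

    logged: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def add_output_metadata(
        self, metadata: Dict[str, Any], output_name: str = "result"
    ) -> None:
        # Metadata is keyed by output name so that multiple outputs stay separate.
        self.logged.setdefault(output_name, {}).update(metadata)


@dataclass(frozen=True)
class SketchOutputContext:
    """Stand-in for the context handed to an IO manager."""

    name: str
    runtime_metadata: Dict[str, Any]


def build_output_context(event: OutputEvent, step: StepState) -> SketchOutputContext:
    # Merge both sources so the IO manager does not care how metadata was supplied.
    # Assumption: inline metadata overrides logged metadata on duplicate keys.
    merged = {**step.logged.get(event.output_name, {}), **event.metadata}
    return SketchOutputContext(name=event.output_name, runtime_metadata=merged)


step = StepState()
step.add_output_metadata({"row_count": 10}, output_name="result")
step.add_output_metadata({"row_count": 99}, output_name="other")
event = OutputEvent(value=[1, 2, 3], metadata={"some": "metadata"})
ctx = build_output_context(event, step)
```

With a single merged field like this, an IO manager would read ctx.runtime_metadata and never need to reach into the step context.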

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

Requests

  • https://dagster.slack.com/archives/C01U5LFUZJS/p1659619871992709

OwenKephart avatar Jul 28 '22 16:07 OwenKephart

Related to https://github.com/dagster-io/dagster/issues/8521

sryza avatar Jul 28 '22 16:07 sryza

I did not even realize the second option existed. How does that work in the context of multiple Outputs?

dmosesson avatar Dec 27 '22 16:12 dmosesson

@dmosesson output_name is one of the arguments of add_output_metadata on an op/asset context. We generally recommend the other option (metadata on the Output itself), though.

sryza avatar Dec 27 '22 17:12 sryza

We were impacted by this. Our use-case is that we want to report all numerical metadata for each asset to DataDog.

        # Keep only the integer and float metadata entries.
        numerical_metadata: dict[str, IntMetadataValue | FloatMetadataValue] = {
            k: v
            for k, v in context.get_logged_metadata().items()
            if isinstance(v, (IntMetadataValue, FloatMetadataValue))
        }

        ...

        for metadata_key, metadata_value in numerical_metadata.items():
            statsd.distribution(f"asset.{metadata_key}", metadata_value.value, tags=[f"asset:{asset_name}"])

The hack mentioned above (context.step_context.get_output_metadata(context.name)) unblocks us, but I'm concerned that this will be removed as suggested by the warning here: https://github.com/dagster-io/dagster/blob/ff0fb791845f25f5f25fb43f56a1e68d76684efb/python_modules/dagster/dagster/_core/execution/context/output.py#L343-L348
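For anyone who wants to try the numeric filtering above outside of dagster, here is a rough self-contained sketch. IntValue, FloatValue and TextValue are hypothetical stand-ins for dagster's metadata value classes, and emit is a stand-in for the statsd call, so none of this touches dagster or DataDog.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class IntValue:
    value: int


@dataclass(frozen=True)
class FloatValue:
    value: float


@dataclass(frozen=True)
class TextValue:
    value: str


Numeric = Union[IntValue, FloatValue]
Emit = Callable[[str, Union[int, float], List[str]], None]


def numeric_only(metadata: Dict[str, object]) -> Dict[str, Numeric]:
    # Keep only entries wrapped in one of the numeric stand-in classes.
    return {
        k: v for k, v in metadata.items() if isinstance(v, (IntValue, FloatValue))
    }


def report(asset_name: str, metadata: Dict[str, object], emit: Emit) -> None:
    # One metric per numeric entry, tagged with the asset name.
    for key, wrapped in numeric_only(metadata).items():
        emit(f"asset.{key}", wrapped.value, [f"asset:{asset_name}"])


sent: List[Tuple[str, Union[int, float], List[str]]] = []
report(
    "orders",
    {
        "row_count": IntValue(10),
        "null_ratio": FloatValue(0.25),
        "owner": TextValue("data-team"),
    },
    lambda name, value, tags: sent.append((name, value, tags)),
)
```

Passing the emitter in as a callable keeps the filtering logic testable without a metrics client.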

FlightOfStairs avatar Dec 21 '23 17:12 FlightOfStairs