Make Output metadata available on the OutputContext object
What's the use case?
When you yield an Output from the body of an op, you have the opportunity to add metadata to that event in one of two ways:
- yield Output(..., metadata={"some": "metadata"})
- context.add_output_metadata(...)
Inside the IOManager, you might want to use this runtime metadata to modulate how you process that output. If the metadata is supplied the first way, this is essentially impossible, because you don't have access to the Output object (perhaps you could do some gross thing to get it, but I didn't feel like digging into that). If it is supplied the second way, you can call context.step_context.get_output_metadata(context.name) to retrieve it. That works OK, but it is a little confusing, since nowhere else in Dagster does it matter which way the output metadata was supplied.
Ideas of implementation
When building the OutputContext, we could supply a new field, "runtime_metadata" (name TBD).
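To make the idea concrete, here is a minimal sketch of what building such a field could look like. It does not use Dagster itself: Output, StepContext, OutputContext and build_output_context below are simplified stand-ins written for illustration, and only the field name runtime_metadata comes from the proposal above. The rule that metadata on the Output wins on a key conflict is an assumption of this sketch, not something Dagster specifies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass
class Output:
    """Stand-in for a yielded output (hypothetical, not Dagster's class)."""

    value: Any
    output_name: str = "result"
    metadata: Mapping[str, Any] = field(default_factory=dict)


class StepContext:
    """Stand-in for the step state that records add_output_metadata calls."""

    def __init__(self) -> None:
        # Metadata is stored per output name, so multiple outputs stay separate.
        self._output_metadata: Dict[str, Dict[str, Any]] = {}

    def add_output_metadata(
        self, metadata: Mapping[str, Any], output_name: str = "result"
    ) -> None:
        self._output_metadata.setdefault(output_name, {}).update(metadata)

    def get_output_metadata(self, output_name: str) -> Optional[Dict[str, Any]]:
        return self._output_metadata.get(output_name)


@dataclass(frozen=True)
class OutputContext:
    """Stand-in for the context handed to the IO manager."""

    name: str
    runtime_metadata: Mapping[str, Any]  # the proposed field (name TBD)


def build_output_context(step_context: StepContext, output: Output) -> OutputContext:
    """Merge both metadata sources so the IO manager sees a single mapping."""
    logged = step_context.get_output_metadata(output.output_name) or {}
    # Assumption of this sketch: metadata attached to the Output overrides
    # metadata logged through the context when the same key appears in both.
    merged = {**logged, **output.metadata}
    return OutputContext(name=output.output_name, runtime_metadata=merged)


if __name__ == "__main__":
    step = StepContext()
    step.add_output_metadata({"row_count": 10})
    ctx = build_output_context(step, Output(value=[1, 2, 3], metadata={"some": "metadata"}))
    print(ctx.name, dict(ctx.runtime_metadata))
```

With a field like this, an IO manager would read context.runtime_metadata and would not need to care which of the two ways the op used to attach the metadata.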
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
Requests
- https://dagster.slack.com/archives/C01U5LFUZJS/p1659619871992709
Related to https://github.com/dagster-io/dagster/issues/8521
I did not even realize the second option existed. How does that work in the context of multiple Outputs?
@dmosesson output_name is one of the arguments of add_output_metadata on an op/asset context. We generally recommend the other option though.
We were impacted by this. Our use-case is that we want to report all numerical metadata for each asset to DataDog.
numerical_metadata: dict[str, IntMetadataValue | FloatMetadataValue] = {
    k: v
    for k, v in context.get_logged_metadata().items()
    if isinstance(v, (IntMetadataValue, FloatMetadataValue))
}
...
for metadata_key, metadata_value in numerical_metadata.items():
    statsd.distribution(f"asset.{metadata_key}", metadata_value.value, tags=[f"asset:{asset_name}"])
The hack mentioned above (context.step_context.get_output_metadata(context.name)) unblocks us, but I'm concerned that this will be removed as suggested by the warning here: https://github.com/dagster-io/dagster/blob/ff0fb791845f25f5f25fb43f56a1e68d76684efb/python_modules/dagster/dagster/_core/execution/context/output.py#L343-L348