Table IO Managers should capture column schemas with appropriate metadata tag

Open · slopp opened this issue 9 months ago · 3 comments

What's the use case?

Many of the IO managers add dataframe_columns metadata to an asset materialization, e.g. https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-duckdb-pandas/dagster_duckdb_pandas/duckdb_pandas_type_handler.py#L66

In https://github.com/dagster-io/dagster/pull/20424, we standardized on dagster/column_schema as the metadata key for this type of information, and that metadata is now used in the UI and in asset checks, e.g.:

https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/definitions/asset_check_factories/schema_change_checks.py#L58

We should consider updating the metadata key that the IO managers create and use, though this would be a breaking change.
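
One possible way to soften the breaking change would be to emit the schema under both keys for a transition period. The sketch below is only an illustration of that idea, not how the IO managers are implemented: `build_column_metadata` and its `include_legacy_key` flag are hypothetical names, and plain dicts stand in for the table schema value that a real IO manager would attach to the materialization.

```python
# Hypothetical sketch: emit the column schema under the standardized key,
# optionally duplicating it under the legacy key during a deprecation window.

LEGACY_KEY = "dataframe_columns"        # key the IO managers emit today
STANDARD_KEY = "dagster/column_schema"  # key standardized in PR 20424


def build_column_metadata(columns, include_legacy_key=True):
    """Build a metadata dict from a mapping of column name -> dtype.

    Plain dicts are used as a stand-in for a real table schema object
    (an assumption made to keep the example self-contained).
    """
    schema = [{"name": name, "type": str(dtype)} for name, dtype in columns.items()]
    metadata = {STANDARD_KEY: schema}
    if include_legacy_key:
        # Keep the old key so existing consumers of the metadata keep working.
        metadata[LEGACY_KEY] = schema
    return metadata


if __name__ == "__main__":
    print(build_column_metadata({"id": "int64", "name": "object"}))
```

Setting `include_legacy_key=False` would correspond to the final state, in which only dagster/column_schema is written.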

Ideas of implementation

No response

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

slopp · May 17 '24 21:05