dagster
Table IO Managers should capture column schemas with appropriate metadata tag
What's the use case?
Many of the IO managers add `dataframe_columns` metadata to an asset materialization, e.g. https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-duckdb-pandas/dagster_duckdb_pandas/duckdb_pandas_type_handler.py#L66
In https://github.com/dagster-io/dagster/pull/20424, we standardized on `dagster/column_schema` as the metadata key name for this type of information, and that metadata is now used in the UI and in asset checks, e.g.:
https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/definitions/asset_check_factories/schema_change_checks.py#L58
We should consider updating the metadata key that the IO managers emit to `dagster/column_schema`, though this would be a breaking change for anyone reading the old key.
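One way to soften the breaking change might be to emit both keys for a deprecation period. The sketch below is only an illustration in plain Python: the two key names come from this issue, but `with_standard_schema_key` and its `keep_legacy` flag are hypothetical and not part of Dagster. It treats the metadata value as opaque and copies it under the new key unchanged.

```python
# Key names taken from this issue; everything else is a hypothetical sketch.
LEGACY_KEY = "dataframe_columns"
STANDARD_KEY = "dagster/column_schema"


def with_standard_schema_key(metadata, keep_legacy=True):
    """Return a copy of `metadata` that also carries the standardized key.

    The schema value is copied as-is from the legacy key. An existing
    standardized entry is never overwritten.
    """
    result = dict(metadata)
    if LEGACY_KEY in result and STANDARD_KEY not in result:
        result[STANDARD_KEY] = result[LEGACY_KEY]
    if not keep_legacy:
        # Drop the old key once downstream consumers have migrated.
        result.pop(LEGACY_KEY, None)
    return result
```

An IO manager could pass its metadata dict through a helper like this before attaching it to the materialization, then switch to `keep_legacy=False` in a later release.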
Ideas of implementation
No response
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.