OpenMetadata
Column Profiling metrics based on data type (for UI and calculations)
Is your feature request related to a problem? Please describe.
Column profiling metrics should only be shown if they are relevant to the column's data type.
Describe the solution you'd like
For example, the UI should not show SUM profiling for dates or strings.
Example from the Sandbox: for a VARCHAR column, 3 of the 5 graphs are empty and not relevant, which is confusing to users.
https://sandbox.open-metadata.org/profiler-dashboard/column/sample_data.ecommerce_db.shopify.dim_address.first_name
Also, the profiling graphs don't resize when the screen is minimized.
The calculations should also be skipped in these cases.
Describe alternatives you've considered
None.
Proposal:
- Prepare a system-level configuration where users can decide which metrics are interesting for them based on each datatype
- Users will be able to opt in/out of the computation of certain metrics globally
- The UI won't show the graphs of metrics excluded in these settings
- The profiler workflow won't compute the metrics excluded in these settings.
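To make the proposal concrete, here is a minimal sketch of how such an exclusion mapping could be applied before computing or displaying metrics. The setting name `EXCLUDED_METRICS`, the helper `metrics_to_compute`, and the metric names are all hypothetical and used only for illustration; they are not the actual OpenMetadata schema or API.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical global setting: data type -> metrics the user opted out of.
# Both the keys and the metric names are illustrative assumptions.
EXCLUDED_METRICS: Dict[str, Set[str]] = {
    "VARCHAR": {"sum", "mean", "median"},
    "DATE": {"sum"},
}


def metrics_to_compute(
    dtype: str,
    candidates: Iterable[str],
    excluded: Optional[Dict[str, Set[str]]] = None,
) -> List[str]:
    """Return the candidate metrics that are not excluded for this data type."""
    config = EXCLUDED_METRICS if excluded is None else excluded
    blocked = config.get(dtype.upper(), set())
    # Preserve the caller's ordering; drop only the excluded metrics.
    return [metric for metric in candidates if metric not in blocked]


if __name__ == "__main__":
    print(metrics_to_compute("varchar", ["sum", "nullCount", "distinctCount"]))
    print(metrics_to_compute("int", ["sum", "nullCount"]))
```

Under this assumption, both the profiler workflow and the UI could call the same filter, so a metric excluded for a data type is neither computed nor charted.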
We'll take care of this in 1.3
Tasks
backend
- [ ] Add `profilerConfig` endpoint to the `configResource` to allow mapping of `dtype <-> metrics` to exclude
Ingestion
- [ ] Update profiler to check global `profilerConfig` for metric exclusion
UI
- [ ] Add configuration page in the settings
Hello, do we have any plans for this enhancement?
Aiming at 1.4; noted on the project.
@pmbrull @TeddyCr @ShaileshParmar11 are we planning to finish this in release 1.4.0?
Re-opening, as the UI side of the changes is still pending.