Profiler support for Complex Data Types (json, arrays, geo...)
Is your feature request related to a problem? Please describe.
OM currently manages a set of data types for which the profiler does not compute any metrics. This is inconvenient because some metrics are useful in the context of complex data types (null counts, size for data structures, uniqueness for geo).
Describe the solution you'd like
1. Handle nullCount for all data types.
2. Add 3 more groups of metrics:
- Collections (arrays, lists)
- Structs (json, etc...)
- Complex (Geo)
Handle each of these groups with specific sets of metrics to compute.
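As a rough illustration of the grouping proposed above, here is a minimal Python sketch. It is not the OpenMetadata profiler API: `MetricGroup`, `TYPE_TO_GROUP`, `GROUP_METRICS`, `metrics_for`, and `null_count` are hypothetical names, the type-to-group mapping (including putting Redshift's SUPER under structs) is an assumption, and the metric names other than `nullCount` are placeholders for whatever each group ends up computing.

```python
from enum import Enum
from typing import Any, List, Sequence


class MetricGroup(Enum):
    """Hypothetical metric groups, mirroring the three groups in the proposal."""
    COLLECTION = "collection"  # arrays, lists
    STRUCT = "struct"          # json, etc.
    COMPLEX = "complex"        # geo


# Assumed mapping from a column's data type name to its metric group.
TYPE_TO_GROUP = {
    "ARRAY": MetricGroup.COLLECTION,
    "JSON": MetricGroup.STRUCT,
    "SUPER": MetricGroup.STRUCT,  # assumption: treat SUPER as struct-like
    "GEOMETRY": MetricGroup.COMPLEX,
    "GEOGRAPHY": MetricGroup.COMPLEX,
}

# Placeholder metric sets per group; nullCount appears in every one.
GROUP_METRICS = {
    MetricGroup.COLLECTION: ["nullCount", "size"],
    MetricGroup.STRUCT: ["nullCount", "size"],
    MetricGroup.COMPLEX: ["nullCount", "uniqueCount"],
}

# Types outside the three groups still get nullCount (point 1 of the proposal).
DEFAULT_METRICS = ["nullCount"]


def metrics_for(data_type: str) -> List[str]:
    """Return the metric names to compute for a column of the given type."""
    group = TYPE_TO_GROUP.get(data_type.upper())
    return list(GROUP_METRICS.get(group, DEFAULT_METRICS))


def null_count(values: Sequence[Any]) -> int:
    """Count missing values; works regardless of the column's data type."""
    return sum(1 for value in values if value is None)
```

For example, `metrics_for("geometry")` selects the geo group, while an unmapped type falls back to `nullCount` only. In a real profiler the null count would more likely be pushed down to the database as a query rather than computed over fetched values.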
Describe alternatives you've considered
None...
Additional context
This is an example of a profiler result for a Redshift table with complex data type columns. No metrics are collected for the SUPER, GEOMETRY, and GEOGRAPHY columns.
Hey @sushi30, I want to contribute to this. Where can I start? I am new.
@samarth-jain28 please connect on our slack and post a message in #contributor. That will be a more appropriate place to handle the discussion.
Hi @sushi30, I noticed there's been no recent activity on this issue. If you're not working on it, could it be reassigned to me? I'd be happy to help.
Thanks!
May I be assigned this issue?
@tristanhendry are you planning on contributing to this issue?
No, I apologize for the inactivity and thank you for reaching out.