
No data for Min and Max when enable profiling

Open njalan opened this issue 2 years ago • 3 comments

Below is my configuration for enabling profiling:

    profiling:
      enabled: True
    profile_pattern:
      allow:
        - "ods_mos_pvg.xxx"

After ingestion there are statistics for Null Count and Distinct Count, but there is no data for Min | Max | Mean | Median.

njalan avatar Jun 13 '22 07:06 njalan

For a bug report to be actionable, could you provide more information, such as the source and the DataHub version? Could you also verify that the column is created as a numeric type?

xiphl avatar Jun 13 '22 10:06 xiphl
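
One way to answer the numeric-type question is to run `DESCRIBE` on the table in Presto and check the reported column types. The helper below is a minimal sketch of that check, not part of DataHub: the function names and the sample column list are hypothetical, and it assumes the type strings look the way Presto prints them (for example `bigint`, `decimal(10,2)`, `varchar(20)`).

```python
import re

# Base names of Presto's numeric types, as they appear in DESCRIBE output.
NUMERIC_PRESTO_TYPES = {
    "tinyint", "smallint", "integer", "bigint",
    "real", "double", "decimal",
}


def is_numeric_presto_type(type_name: str) -> bool:
    """Return True if a Presto type string names a numeric type."""
    # Drop parameters such as "(10,2)" and trailing words such as
    # "with time zone", keeping only the base type name.
    base = re.split(r"[\s(]", type_name.strip().lower(), maxsplit=1)[0]
    return base in NUMERIC_PRESTO_TYPES


def non_numeric_columns(columns):
    """Return names of columns whose type is not numeric.

    `columns` is an iterable of (column_name, type_name) pairs,
    e.g. copied from the output of DESCRIBE.
    """
    return [name for name, type_name in columns
            if not is_numeric_presto_type(type_name)]


# Hypothetical DESCRIBE output, for illustration only.
sample_columns = [
    ("id", "bigint"),
    ("name", "varchar(20)"),
    ("price", "decimal(10,2)"),
]
print(non_numeric_columns(sample_columns))
```

Min, Max, Mean and Median would only be expected for columns this check reports as numeric; if numeric columns still show no values, the column types are unlikely to be the cause.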

@xiphl The DataHub version is 0.8.36. The source is a Hive table, but it is stored in Apache Hudi format. No column has statistics for Min | Max | Mean | Median, even the numeric ones. Is it because the tables are Hudi? Below is my yml file:

    source:
      type: sqlalchemy
      config:
        schema_pattern:
          allow:
            - test
        table_pattern:
          allow:
            - test.part
        profiling:
          enabled: True
        profile_pattern:
          allow:
            - "test.part"
        platform: presto
        connect_uri: presto://xxx:xxxx@xxxxx:8443
        options:
          connect_args:
            host: xxxx
            port: 8443
            catalog: hive
            schema: test
            protocol: https
            requests_kwargs:
              verify: False
        domain:
          "urn:li:domain:hive_profile":
            allow:
              - ".*"

    sink:
      type: datahub-rest
      config:
        server: 'http://localhost:8080'

njalan avatar Jun 14 '22 02:06 njalan

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot] avatar Aug 09 '22 02:08 github-actions[bot]

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot] avatar Sep 15 '22 06:09 github-actions[bot]