[AMORO-2861]: Support display Hudi table metadata in Amoro Dashboard

Open baiyangtx opened this issue 1 year ago • 2 comments

Why are the changes needed?

Close #2861.

Brief change log

  • Add Hudi table catalog support.
  • Implement the Hudi format catalog for the Hadoop metastore.
  • Implement the Hudi format catalog for the Hive metastore.

How was this patch tested?

  • [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • [ ] Add screenshots for manual tests if appropriate

  • [ ] Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

baiyangtx avatar May 29 '24 08:05 baiyangtx

Support choosing the Hudi format when creating a Hive/Hadoop catalog.

image

Support displaying Hudi tables in the catalog, with a Hudi icon.

image

Table details

Support displaying Hudi table details, including table summary, schema, metrics, table type, and properties.

image

Support displaying partitions and the files in each partition.

Due to limitations of the Hudi API, some fields have no values.

image image

Support displaying the timeline as snapshots.

image image

Due to limitations of the Hudi API, record statistics are missing, and the file count is incorrect for deltacommit.

Support displaying compact/cluster instants as optimizing processes.

image image

Only completed instants are shown; compact is shown as minor optimizing and cluster as major optimizing.
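The mapping described above can be sketched roughly as follows. This is only an illustration: `Instant`, `OptimizingProcess`, `OptimizingType`, and the operation labels "compact" and "cluster" are hypothetical stand-ins taken from the wording of this description, not Amoro or Hudi classes or Hudi timeline action names.

```java
import java.util.ArrayList;
import java.util.List;

public class HudiInstantMapping {

    // Hypothetical simplified model; not Amoro's or Hudi's actual types.
    enum OptimizingType { MINOR, MAJOR }

    static final class Instant {
        final String timestamp;
        final String operation;   // placeholder label: "compact", "cluster", or anything else
        final boolean completed;

        Instant(String timestamp, String operation, boolean completed) {
            this.timestamp = timestamp;
            this.operation = operation;
            this.completed = completed;
        }
    }

    static final class OptimizingProcess {
        final String timestamp;
        final OptimizingType type;

        OptimizingProcess(String timestamp, OptimizingType type) {
            this.timestamp = timestamp;
            this.type = type;
        }
    }

    // Keep only completed instants; map compact to MINOR and cluster to MAJOR.
    // Instants with any other operation label are skipped.
    static List<OptimizingProcess> toOptimizingProcesses(List<Instant> timeline) {
        List<OptimizingProcess> result = new ArrayList<>();
        for (Instant instant : timeline) {
            if (!instant.completed) {
                continue; // pending or in-flight instants are not shown
            }
            if ("compact".equals(instant.operation)) {
                result.add(new OptimizingProcess(instant.timestamp, OptimizingType.MINOR));
            } else if ("cluster".equals(instant.operation)) {
                result.add(new OptimizingProcess(instant.timestamp, OptimizingType.MAJOR));
            }
        }
        return result;
    }
}
```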

DDL

Due to limitations of the Hudi API, the DDL history is empty.

baiyangtx avatar Jun 21 '24 02:06 baiyangtx

Some check style errors exist, which you may want to fix. @baiyangtx

zhoujinsong avatar Jun 25 '24 11:06 zhoujinsong

cc @majin1102 @zhoujinsong

czy006 avatar Aug 13 '24 10:08 czy006

When using getTableSnaphots, an exception occurs:

Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Objavro': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false') as [Source: (String)"Obj\u0001\u0002\u00016avro.schema\u0015{"type":"record","name":"HoodieCleanMetadata", ......}]
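The token in the error, `Objavro`, matches the start of an Avro object container file: the four magic bytes `O`, `b`, `j`, 0x01, followed by the `avro.schema` metadata key. That suggests binary Avro content reached a JSON parser. A minimal sketch of a guard that detects this case before parsing is shown below; `AvroMagicCheck` and `looksLikeAvroContainer` are hypothetical names, and this is not how Amoro or Hudi actually read instant metadata.

```java
public class AvroMagicCheck {

    // Avro object container files begin with these four bytes.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the content starts with the Avro container magic,
    // in which case it should not be handed to a JSON parser.
    static boolean looksLikeAvroContainer(byte[] content) {
        if (content == null || content.length < AVRO_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < AVRO_MAGIC.length; i++) {
            if (content[i] != AVRO_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```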

yhf20071 avatar May 08 '25 02:05 yhf20071

When viewing snapshot details, the exception is: Getting all partition paths with file system listing sequentially can be very slow. This should not be invoked.

yhf20071 avatar May 08 '25 03:05 yhf20071

@yhf20071 Thanks for the feedback.

Can you create GH issues to help the community track these bugs?

zhoujinsong avatar May 08 '25 03:05 zhoujinsong