[HUDI-7696] Refactoring functions in HoodieTableMetadataUtil

Open bibhu107 opened this issue 1 year ago • 2 comments

Change Logs

Two functions in HoodieTableMetadataUtil, convertFilesToPartitionStatsRecords and convertMetadataToPartitionStatsRecords, are being evaluated for unification. These functions perform similar tasks of creating statistics for records in a partition. The proposal aims to combine them into a single function, improving code readability and maintainability.

Reference: https://github.com/apache/hudi/pull/10352#discussion_r1584149612

Impact

This change primarily affects the internal implementation of HoodieTableMetadataUtil. It is not expected to impact the public API directly.

Other function pairs follow the same pattern:

  • convertFilesToColumnStatsRecords and convertMetadataToColumnStatsRecords
  • convertFilesToBloomFilterRecords and convertMetadataToBloomFilterRecords

Unification challenge: To merge these function pairs, HoodieCommitMetadata would first have to be transformed into a format the unified function accepts. That transformation adds an extra pass over the commit's write stats and an intermediate structure, which could introduce inefficiencies.
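
To make the trade-off concrete, here is a minimal sketch of the adapter approach. The types and method names (`WriteStat`, `groupByPartition`, `toPartitionStatsRecords`) are hypothetical, simplified stand-ins and not the actual Hudi classes or signatures. The sketch assumes the commit metadata can be reduced to a list of per-file stats that each carry a partition path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT Hudi classes or signatures.
final class PartitionStatsSketch {

  // Simplified stand-in for one per-file write stat carried by commit metadata.
  static final class WriteStat {
    final String partitionPath;
    final String fileName;

    WriteStat(String partitionPath, String fileName) {
      this.partitionPath = partitionPath;
      this.fileName = fileName;
    }
  }

  // Adapter: regroup commit-style write stats into partition -> file names.
  // This extra pass and the intermediate map are the potential overhead.
  static Map<String, List<String>> groupByPartition(List<WriteStat> writeStats) {
    Map<String, List<String>> partitionToFiles = new LinkedHashMap<>();
    for (WriteStat stat : writeStats) {
      partitionToFiles
          .computeIfAbsent(stat.partitionPath, p -> new ArrayList<>())
          .add(stat.fileName);
    }
    return partitionToFiles;
  }

  // Unified entry point: every caller hands over the same shape.
  // The "record" here is only a string holding the file count per partition.
  static List<String> toPartitionStatsRecords(Map<String, List<String>> partitionToFiles) {
    List<String> records = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : partitionToFiles.entrySet()) {
      records.add(entry.getKey() + ":" + entry.getValue().size());
    }
    return records;
  }

  public static void main(String[] args) {
    List<WriteStat> stats = Arrays.asList(
        new WriteStat("2024/08/25", "a.parquet"),
        new WriteStat("2024/08/25", "b.parquet"),
        new WriteStat("2024/08/26", "c.parquet"));
    // Commit path: adapt first, then call the single unified function.
    System.out.println(toPartitionStatsRecords(groupByPartition(stats)));
  }
}
```

In this shape, a caller that already has a partition-to-files listing calls the unified function directly, while the commit path pays for the regrouping step first. Whether that cost matters in practice would depend on the size of the commit.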

Risk level

Low

Documentation Update

N/A

As this is an internal code refactoring, no user-facing documentation updates are required.

Contributor's checklist

  • [ ] Read through contributor's guide
  • [ ] Change Logs and Impact were stated clearly
  • [ ] Adequate tests were added if applicable
  • [ ] CI passed

cc - @codope

bibhu107, Aug 25 '24 16:08

Makes sense @codope. Have made necessary changes.

bibhu107, Aug 28 '24 17:08

CI report:

  • 6d3f0f7831f76c09776ada2d03288e1d1d5e140f Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot, Aug 28 '24 18:08