
[HUDI-8126] Use union to parallelize data and error table writes

Open kroushan-nit opened this issue 1 year ago • 1 comments

Change Logs

Enable writing the error table and the data table in parallel. This behavior is disabled by default and can be enabled by setting the error table config property hoodie.errortable.write.union.enable to true.
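As a minimal sketch of how the new flag might be supplied, the snippet below renders config overrides as properties-file lines. Only the key hoodie.errortable.write.union.enable comes from this PR; the helper render_properties and the variable names are hypothetical, and how the resulting properties reach the writer depends on the deployment.

```python
def render_properties(overrides):
    """Hypothetical helper: render config overrides as properties-file lines."""
    lines = []
    for key, value in overrides.items():
        # Properties files use lowercase true/false for booleans.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    return "\n".join(lines)


error_table_overrides = {
    # Config key from this PR; the feature is disabled unless set to true.
    "hoodie.errortable.write.union.enable": True,
}

properties_text = render_properties(error_table_overrides)
print(properties_text)
```

The flag only matters when the error table writer is already enabled, so it would sit alongside the existing error table settings rather than replace them.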

Impact

Improves overall write performance and commit times when the error table writer is enabled.

Risk level (write none, low, medium or high below)

None

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the instruction to make changes to the website.

A new config (hoodie.errortable.write.union.enable) is being added to HoodieErrorTableConfig as part of this PR

Contributor's checklist

  • [x] Read through contributor's guide
  • [x] Change Logs and Impact were stated clearly
  • [x] Adequate tests were added if applicable
  • [x] CI passed

kroushan-nit, Aug 27 '24 18:08

CI report:

  • 3c0807a22c8b2722ae863e86da1c8d62a91da167 Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot, Sep 08 '24 14:09

Closing as it is a duplicate of https://github.com/apache/hudi/pull/12813

kroushan-nit, Feb 18 '25 04:02