[HUDI-8126] Use union to parallelize data and error table writes
Change Logs
Enable writing the error table and the data table in parallel. This behavior is disabled by default and can be enabled by setting the error table config property hoodie.errortable.write.union.enable to true.
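For reviewers who want to try the flag, here is a minimal sketch of setting it through plain java.util.Properties. Only the property key comes from this PR; the class and helper names (ErrorTableUnionConfigExample, withUnionWriteEnabled, isUnionWriteEnabled) are hypothetical and are not part of Hudi. How the resulting properties reach the writer depends on your job setup and is not shown.

```java
import java.util.Properties;

public class ErrorTableUnionConfigExample {
    // Property key as named in this PR description.
    static final String UNION_WRITE_KEY = "hoodie.errortable.write.union.enable";

    // Hypothetical helper: returns a copy of the given properties with the flag turned on.
    static Properties withUnionWriteEnabled(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty(UNION_WRITE_KEY, "true");
        return props;
    }

    // Hypothetical helper: reads the flag, falling back to false
    // because the PR states the behavior is disabled by default.
    static boolean isUnionWriteEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(UNION_WRITE_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties props = withUnionWriteEnabled(new Properties());
        System.out.println(UNION_WRITE_KEY + "=" + props.getProperty(UNION_WRITE_KEY));
    }
}
```

The helper copies its input rather than mutating it, so an existing writer configuration stays unchanged unless the returned properties are the ones passed on.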
Impact
Improves overall write performance and reduces commit times when the error table writer is enabled.
Risk level (write none, low medium or high below)
None
Documentation Update
This PR adds a new config (hoodie.errortable.write.union.enable) to HoodieErrorTableConfig.
Contributor's checklist
- [x] Read through contributor's guide
- [x] Change Logs and Impact were stated clearly
- [x] Adequate tests were added if applicable
- [x] CI passed
CI report:
- 3c0807a22c8b2722ae863e86da1c8d62a91da167 Azure: SUCCESS
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build
Closing as it is a duplicate of https://github.com/apache/hudi/pull/12813