hudi
[HUDI-9405] Streaming writes to mdt end to end
Change Logs
This is part of a larger feature that adds streaming write support to the metadata table. We are going to raise many smaller patches under https://issues.apache.org/jira/browse/HUDI-9281. The goal is to keep producing small patches that can land in isolation without being blocked on a very large patch. For the overall design, see https://github.com/apache/hudi/pull/13236
Prior to reviewing this patch, please review these patches:
- https://github.com/apache/hudi/pull/13290
- https://github.com/apache/hudi/pull/13292
- https://github.com/apache/hudi/pull/13295
- https://github.com/apache/hudi/pull/13286
- https://github.com/apache/hudi/pull/13305
- https://github.com/apache/hudi/pull/13307
Commits of interest to review in this patch: https://github.com/apache/hudi/pull/13312/files/c1752e425ae75758ca1c74cb9a3d20faa1880902..06e5878c03264d9daed9390e9299acff58bb3e55
We are adding a new abstraction named HoodieMetadataWriterWrapper, which is the interface that SparkRDDWriteClient and SparkRDDTableServiceClient use to interact with the metadata table. Irrespective of whether streaming writes are enabled, we keep all the intricacies within HoodieMetadataWriterWrapper (such as the life cycle management of the different metadata writers corresponding to each action in the data table in case of streaming writes). We do not want to pollute the core SparkRDDWriteClient or the SparkRDDTableServiceClient with the streaming writer state management and interactions.
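To make the intent of the wrapper concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the code in this patch: `SketchMetadataWriter`, `RecordingMetadataWriter`, `SketchMetadataWriterWrapper`, and all method names are hypothetical stand-ins, and write statuses are modeled as plain strings. It only illustrates how a wrapper could own one metadata writer per data table action so that the clients never hold that state themselves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a metadata table writer; NOT Hudi's real interface.
interface SketchMetadataWriter extends AutoCloseable {
    void streamWrite(String instantTime, List<String> writeStatuses);
    void commit(String instantTime);
    @Override
    void close();
}

// Test double that records calls instead of writing to a metadata table.
class RecordingMetadataWriter implements SketchMetadataWriter {
    final List<String> events = new ArrayList<>();
    boolean closed = false;

    @Override
    public void streamWrite(String instantTime, List<String> writeStatuses) {
        events.add("write:" + instantTime + ":" + writeStatuses.size());
    }

    @Override
    public void commit(String instantTime) {
        events.add("commit:" + instantTime);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical wrapper: the clients call only these methods, and the wrapper
// decides when a per-action writer is opened, reused, committed, and closed.
class SketchMetadataWriterWrapper {
    private final boolean streamingWritesEnabled;
    private final Supplier<SketchMetadataWriter> writerFactory;
    // One writer per in-flight data table action, keyed by instant time.
    private final Map<String, SketchMetadataWriter> writersByInstant = new HashMap<>();

    SketchMetadataWriterWrapper(boolean streamingWritesEnabled,
                                Supplier<SketchMetadataWriter> writerFactory) {
        this.streamingWritesEnabled = streamingWritesEnabled;
        this.writerFactory = writerFactory;
    }

    // Called while the data table write is in progress. A no-op when
    // streaming writes are disabled (assumption made for this sketch).
    void streamWrite(String instantTime, List<String> writeStatuses) {
        if (!streamingWritesEnabled) {
            return;
        }
        writersByInstant
            .computeIfAbsent(instantTime, t -> writerFactory.get())
            .streamWrite(instantTime, writeStatuses);
    }

    // Called when the data table action completes. Reuses the streaming
    // writer if one exists, otherwise opens a short-lived writer.
    void completeAction(String instantTime) {
        SketchMetadataWriter writer = writersByInstant.remove(instantTime);
        if (writer == null) {
            writer = writerFactory.get();
        }
        try {
            writer.commit(instantTime);
        } finally {
            writer.close();
        }
    }

    int openWriterCount() {
        return writersByInstant.size();
    }
}
```

In this sketch a caller would invoke `streamWrite(instantTime, statuses)` while the data table write is running and `completeAction(instantTime)` at commit time; the same two calls work whether or not streaming writes are enabled, which is the property the wrapper is meant to provide.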
Impact
Enables streaming writes to the metadata table directly, using RDDs derived from the data table writes.
Risk level (write none, low, medium or high below)
high
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".
- The config description must be updated if new configs are added or the default values of the configs are changed
- Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the instructions to make changes to the website.
Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
CI report:
- 06e5878c03264d9daed9390e9299acff58bb3e55 Azure: FAILURE
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build