hudi
[HUDI-5973] Fixing refreshing of schemas in HoodieStreamer continuous mode
Change Logs
Fixing refreshing of schemas in HoodieStreamer continuous mode.
Impact
Fixes schema refresh in HoodieStreamer continuous mode. With the file-based schema provider, one has to shut down and restart deltastreamer whenever the schema changes; this patch fixes that. With the schema-registry-based schema provider, every call to getSourceSchema makes a remote call and fetches the real-time schema. If a pipeline has N transformers, each transformer could therefore end up operating with a different schema. With caching and explicit refresh calls, we ensure that only one source schema and one target schema are in play for an entire write.
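The caching-plus-explicit-refresh idea can be sketched roughly as below. This is an illustrative sketch only, not the code in this patch: the class name `CachedSchemaProvider`, the `refresh()` method, and the use of `Supplier<String>` as a stand-in for the underlying schema fetch are all assumptions made for the example.

```java
import java.util.function.Supplier;

/**
 * Hypothetical wrapper (not Hudi's actual class): fetches each schema at most
 * once per write and only re-fetches after an explicit refresh() call.
 */
class CachedSchemaProvider {
  // Stand-ins for the real fetch (file read or schema registry call).
  private final Supplier<String> sourceFetcher;
  private final Supplier<String> targetFetcher;

  private String cachedSource;
  private String cachedTarget;

  CachedSchemaProvider(Supplier<String> sourceFetcher, Supplier<String> targetFetcher) {
    this.sourceFetcher = sourceFetcher;
    this.targetFetcher = targetFetcher;
  }

  /** Returns the cached source schema, fetching it on first use. */
  synchronized String getSourceSchema() {
    if (cachedSource == null) {
      cachedSource = sourceFetcher.get();
    }
    return cachedSource;
  }

  /** Returns the cached target schema, fetching it on first use. */
  synchronized String getTargetSchema() {
    if (cachedTarget == null) {
      cachedTarget = targetFetcher.get();
    }
    return cachedTarget;
  }

  /** Drops both cached schemas; intended to be called once between writes. */
  synchronized void refresh() {
    cachedSource = null;
    cachedTarget = null;
  }
}
```

In this sketch, every transformer within one write sees the same schema because all of them go through the cache, and a changed schema is picked up only when the sync loop calls `refresh()` before the next write.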
Risk level (write none, low, medium or high below)
low
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
- The config description must be updated if new configs are added or the default values of the configs are changed
- Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the instructions to make changes to the website.
Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
CC @danielfordfc @rohitmittapalli
CI report:
- 21b7f2c89745631fc854f11ca201080762728105 UNKNOWN
- 6af3b35ecc9da78e79484800c860c35e529ee0d5 Azure: FAILURE
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build