[HUDI-8036] Handle partition schema for custom key gen in SparkHoodieTableFileIndex
Change Logs
Currently, the partition schema defined for the table in SparkHoodieTableFileIndex does not handle the different partition types of the partition columns. For the custom key generator, these partition types are simple and timestamp. This Jira aims to handle these partition types and to reproduce the issue reported in https://github.com/apache/hudi/issues/8343.
Changes in the PR: a separate file index is now used for HoodieBaseRelation and for snapshot, incremental, and other read queries. This reader file index uses string type as the schema for timestamp partition columns. The logical plans for the insert into, merge into, and update table commands now have to swap the reader file index back to the original file index so that the table schema does not change.
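To illustrate the schema handling described above, here is a minimal sketch in plain Python. It is not the code in this PR: the helper names `parse_partition_fields` and `reader_partition_schema` and the dictionary-based schema are hypothetical, chosen only to show the idea. It assumes the custom key generator's partition path spec has the form `field:SIMPLE,field:TIMESTAMP`, and that timestamp partition columns are exposed to readers as strings while simple partition columns keep their table type.

```python
from typing import Dict, List, Tuple


def parse_partition_fields(spec: str) -> List[Tuple[str, str]]:
    """Split a spec such as 'region:SIMPLE,ts:TIMESTAMP' into (name, type) pairs."""
    fields = []
    for entry in spec.split(","):
        name, _, kind = entry.strip().partition(":")
        fields.append((name, kind.upper()))
    return fields


def reader_partition_schema(spec: str, table_schema: Dict[str, str]) -> Dict[str, str]:
    """Build the partition schema a reader-side file index could use (hypothetical).

    SIMPLE columns keep the type recorded in the table schema.
    TIMESTAMP columns are reported as 'string', since the partition path
    holds a formatted value rather than the raw column value (assumption).
    """
    schema = {}
    for name, kind in parse_partition_fields(spec):
        if kind == "TIMESTAMP":
            schema[name] = "string"
        elif kind == "SIMPLE":
            schema[name] = table_schema[name]
        else:
            raise ValueError(f"Unsupported partition type for {name}: {kind!r}")
    return schema
```

For example, with the spec `region:SIMPLE,ts:TIMESTAMP` and a table schema in which `ts` is a `long`, the sketch reports `ts` as `string` for readers, while the table schema passed in is left unmodified, which is the property the write commands rely on.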
Impact
NA
Risk level (write none, low, medium or high below)
low
Documentation Update
NA
Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
CI report:
- 3f7cca22feffc4f468eb6e17d6fdabafc0b595c5 Azure: SUCCESS
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build