hudi
[HUDI-7431] Add replication and block size to StoragePathInfo to be backwards compatible
Change Logs
This PR adds replication and block size information to StoragePathInfo so that a FileStatus generated from StoragePathInfo stays backward compatible and Hive's FileInputFormat can generate splits based on the block size. Hive's relevant logic is mentioned below. Without this change, the replication and block size information is dropped; with a block size of 0, Hive's input format generates a huge number of splits of size 1, causing a performance regression.
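To illustrate why a block size of 0 leads to splits of size 1, here is a minimal, self-contained sketch of the split-size arithmetic. It assumes the formula `max(minSize, min(goalSize, blockSize))` used by the old-API Hadoop `FileInputFormat` that Hive's input formats build on; the class name `SplitSizeSketch`, the helper `countSplits`, and the 256 MiB file length are hypothetical and chosen only for the example. It does not call into Hadoop or Hudi, and the real split loop also applies a slop factor that this sketch ignores.

```java
public class SplitSizeSketch {

    // Assumed to mirror the split-size formula of Hadoop's old-API FileInputFormat:
    // the block size caps the goal size, and the minimum size is a lower bound.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Simplified split count (ceiling division); the real split loop differs slightly.
    static long countSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileLength = 256L * 1024 * 1024; // hypothetical 256 MiB file
        long goalSize = fileLength;           // one requested split for this file
        long minSize = 1L;                    // default minimum split size

        // Block size preserved (128 MiB): a handful of large splits.
        long withBlockSize = computeSplitSize(goalSize, minSize, 128L * 1024 * 1024);
        System.out.println("split size with block size: " + withBlockSize
                + ", splits: " + countSplits(fileLength, withBlockSize));

        // Block size dropped (0): min(goalSize, 0) = 0, so the split size falls back to minSize = 1.
        long withoutBlockSize = computeSplitSize(goalSize, minSize, 0L);
        System.out.println("split size without block size: " + withoutBlockSize
                + ", splits: " + countSplits(fileLength, withoutBlockSize));
    }
}
```

Under these assumptions, the dropped block size turns a file that would be read in two splits into one split per byte, which matches the regression described above.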
This fixes the test issue found while integrating the HoodieStorage abstraction in #10591.
Impact
Fixes backward compatibility in the HoodieStorage abstraction.
Risk level
low
Documentation Update
N/A
Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
CI report:
- 1ccf94bbcef53d3c4b3d14a3953b432f698e52d3 Azure: SUCCESS
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build