hudi
Upserts, Deletes And Incremental Processing on Big Data.
### Change Logs [HUDI-9332] Pluggable Table Format Support with native Integration 1. Includes base interface for pluggable table format 2. Includes native hudi integration for pluggable table format 3. Includes...
### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance impact._...
### Change Logs Test three cases: 1. Different base file schema; 2. Different log file schema; 3. Different base and log file schema. With extra primitive column and extra nested...
### Change Logs - Removes unused codepaths ### Impact - Clean up the code ### Risk level (write none, low medium or high below) None ### Documentation Update _Describe any...
### Change Logs - The bloom index uses the `HoodieKeyLookupHandle` which looks up the latest base file for the provided file group ID and partition path. This runs the risk...
### Change Logs Spark SQL UPDATE and DELETE do not write record positions to the log files. The PR aims to add the metadata in log files so that it...
### Change Logs Add an in-memory buffer sort to the append write function to improve the Parquet compression ratio. From our experiments and testing, it can improve the compression ratio by 300% with...
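The idea behind sorting a write buffer before flushing it can be illustrated with a small sketch. This is not the PR's implementation: it uses Python's `zlib` as a stand-in for Parquet's encoding and compression, and the record layout, buffer size, and `compressed_size` helper are all hypothetical. It only shows that placing similar records next to each other tends to shrink the compressed output.

```python
import random
import zlib


def compressed_size(records):
    """Return the zlib-compressed size, in bytes, of newline-joined records."""
    payload = "\n".join(records).encode("utf-8")
    return len(zlib.compress(payload))


# Hypothetical write buffer: low-cardinality fields arriving in random order.
rng = random.Random(42)
buffer = [
    f"key-{rng.randrange(50):03d},payload-{rng.randrange(50):03d}"
    for _ in range(20_000)
]

# Compress the buffer as it arrived, then again after sorting it in memory.
unsorted_size = compressed_size(buffer)
sorted_size = compressed_size(sorted(buffer))

# Ratio greater than 1 means the sorted buffer compressed to fewer bytes.
ratio = unsorted_size / sorted_size
print(f"unsorted={unsorted_size} bytes, sorted={sorted_size} bytes, ratio={ratio:.2f}x")
```

The gain in a real columnar file depends on the sort key, the column encodings, and the data distribution, so the figure quoted in the PR should not be expected from this toy example.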
### Change Logs To be filled in. ### Impact _Describe any public API or user-facing feature change or any performance impact._ ### Risk level (write none, low medium or high...
> [!NOTE] > I am experiencing an issue related to Parquet gzip decoding. The table is in COW format; the issue occurs after hundreds of commits and is reproducible. compress codec...
### Change Logs - Updates code to avoid creating empty files when there is content to write ### Impact - Avoid risk of creating an empty file due to a...