[HUDI-9482] Fix the issue where the file slice was wrongly filtered out
Please consider the following case:

```java
/**
 * case: query with pending instant
 *
 * 1. insert-txn1 start, will create a log file
 * 2. insert-txn2 start, will create a log file
 * 3. insert-txn2 commit
 * 4. query-txn3 execute
 * 5. insert-txn1 commit
 *
 * |-------------------------------- txn1: insert --------------------------------------|
 *       |------ txn2: insert ------|
 *                                       |---txn3: query---|
 *
 * we expect txn3's query to see the data written by txn2
 */
```
However, I found that the query result was empty: the data written by txn2 was not read as expected.
The reason is that when constructing the file slice at the current moment, we pick the smallest instant among the log files as the base instant of the slice. In the case above, that means the still-pending txn1 is used as the base instant.
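To make that concrete, here is a minimal sketch of the current selection (plain Java with string instant times, not the actual Hudi code):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class BaseInstantSelectionSketch {
    public static void main(String[] args) {
        // Log files of one file group, identified by the instant that wrote them.
        // txn1 started first (smaller instant time) but is still pending;
        // txn2 started later and has already committed.
        List<String> logInstants = Arrays.asList("20240101000001" /* txn1 */, "20240101000002" /* txn2 */);

        // Current logic: the smallest instant among the log files becomes the
        // base instant of the slice, i.e. the still-pending txn1.
        String baseInstant = logInstants.stream()
            .min(Comparator.naturalOrder())
            .orElseThrow(IllegalStateException::new);

        System.out.println("base instant = " + baseInstant); // txn1's instant time
    }
}
```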
When we query, AbstractTableFileSystemView#getLatestMergedFileSlicesBeforeOrOn is called to fetch the file slices to read.
Eventually we reach the judgment logic that decides whether a slice is committed, which checks whether the slice's base instant exists on the timeline. Per the logic above, the base instant of this slice is txn1, and that instant is not on the timeline: the timeline here consists of completed writes plus pending compactions, so it naturally does not include the still-pending txn1. As a result the entire slice is filtered out, whereas what we actually want is for the log files written by txn1 to be filtered out and the log files written by txn2 to be read.
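The committed-slice check described above then behaves roughly like this sketch (assumed names, not the exact Hudi implementation):

```java
import java.util.Collections;
import java.util.Set;

public class SliceCommittedCheckSketch {
    // Simplified stand-in for the check: a slice is kept only if its base
    // instant exists on the timeline visible to the query.
    static boolean isSliceCommitted(String sliceBaseInstant, Set<String> visibleInstants) {
        return visibleInstants.contains(sliceBaseInstant);
    }

    public static void main(String[] args) {
        // Visible timeline = completed writes + pending compactions; the
        // still-pending txn1 is therefore absent.
        Set<String> visibleInstants = Collections.singleton("20240101000002"); // committed txn2 only
        String sliceBaseInstant = "20240101000001";                            // pending txn1, chosen as base instant

        // false -> the whole slice is dropped, including txn2's committed log file.
        System.out.println(isSliceCommitted(sliceBaseInstant, visibleInstants));
    }
}
```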
Change Logs
- Fix the issue where a file slice was wrongly filtered out when its file group contained both committed and pending log files
Impact
bug fix
Risk level (write none, low medium or high below)
medium
Documentation Update
none
Contributor's checklist
- [x] Read through contributor's guide
- [x] Change Logs and Impact were stated clearly
- [x] Adequate tests were added if applicable
- [x] CI passed
The essence of this problem is: for a file group that has only log files, how do we determine the base instant of its slice? The current logic simply takes the instant with the smallest lexical order as the base instant, which leads to the problem described above. I think we can pretend that each file group initially has an HoodieActiveTimeline.INIT_INSTANT_TS version that has no actual data written. That way we no longer need special handling for slices made up entirely of log files.
My current code changes are based on the above logic; I'm not sure whether there is a better approach. Looking forward to everyone's thoughts.
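A minimal sketch of that idea (simplified types and assumed behavior, not the actual patch): a log-only slice is pinned on a synthetic, data-less initial instant, so the slice itself always survives the committed check, and each log file is then kept or dropped based on whether its own instant has completed.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SyntheticBaseInstantSketch {
    // Plays the role of HoodieActiveTimeline.INIT_INSTANT_TS: an initial
    // version every file group is assumed to have, with no actual data in it.
    static final String INIT_INSTANT_TS = "00000000000000";

    public static void main(String[] args) {
        List<String> logInstants = Arrays.asList("20240101000001" /* txn1 */, "20240101000002" /* txn2 */);
        Set<String> completedInstants = Collections.singleton("20240101000002"); // only txn2 committed

        // The slice's base instant is the synthetic initial version, so the
        // slice is never dropped wholesale for having a pending "base" instant.
        String baseInstant = INIT_INSTANT_TS;

        // Filtering now happens per log file instead of per slice.
        List<String> readableLogInstants = logInstants.stream()
            .filter(completedInstants::contains)
            .collect(Collectors.toList());

        System.out.println(baseInstant);         // 00000000000000
        System.out.println(readableLogInstants); // [20240101000002] -> txn2's log is read, txn1's is skipped
    }
}
```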
@hudi-bot run azure
@danny0405 Hi, Danny, this is a data-quality bug that has already occurred in our production environment. Let's solve this problem together!
@danny0405 Hi, Danny, any ideas?