Upserts, Deletes And Incremental Processing on Big Data.

Results 1006 hudi issues

Currently, we use now() - splitLatestCommit. However, as time passes while the task is processing a single huge commit, the difference between now() and splitLatestCommit may grow larger....

priority:high
type:bug
from-jira
status:pr-available

https://github.com/apache/hudi/pull/12781/files#r1964205520 A new constructor is added.  We should see if this is really needed (rewrite the tests so this is not needed?) and keep the constructors simple, by removing this...

priority:critical
from-jira
type:devtask

[https://github.com/apache/hudi/pull/12105/files#r1815875535] We need to move [this logic|https://github.com/apache/hudi/blob/a7512a206c5a1e8ce251cac7a302632a57d8c848/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java#L855-L858] inside `HoodieMetadataPayload.combineSecondaryIndexRecord`, and override `MetadataPartitionType.combineMetadataPayloads` for the secondary index. While updating the secondary index, the merging logic should follow the same logic as the readFromBaseAndMergeWithLogRecords API...

priority:critical
from-jira
type:devtask

For CUSTOM merge mode, the list of record merging implementation classes is required for the record merging to work.  Persisting it to the table config makes it easier for query...

priority:critical
from-jira
status:needs-attention
type:devtask

Flesh out direction for end to end writes using Row or InternalRow ## JIRA info - Link: https://issues.apache.org/jira/browse/HUDI-9035 - Type: Sub-task - Parent: https://issues.apache.org/jira/browse/HUDI-9019 - Fix version(s): - 1.1.0

priority:high
from-jira
type:devtask

If the user wants to migrate from using the payload class to the merger implementation class, the merger strategy ID needs to be changed, and other record merge configs need...

priority:critical
from-jira
type:devtask

PAYLOAD_CLASS_NAME ("hoodie.compaction.payload.class") is defined in both HoodiePayloadConfig and HoodieTableConfig.  They are used in different places.  We should keep one of them only to avoid confusion. ## JIRA info - Link:...

priority:critical
from-jira
type:devtask

We need to ensure that we cover the following cases for basic column stats certification: 1. Insert a few records and validate. Update the same records and validate that the updates are reflected. Repeat the...

priority:critical
area:metadata-table
from-jira
type:devtask

While working towards making partition stats default, we ran into an issue with the Byte data type [https://github.com/apache/hudi/pull/12671]: min/max values computed by merging multiple values did not align with manually computed...

priority:high
area:metadata-table
from-jira
type:devtask

Related to - https://issues.apache.org/jira/browse/HUDI-8275 Currently, we are not using the new filegroup reader for bootstrap splits. We need to fix that. ## JIRA info - Link: https://issues.apache.org/jira/browse/HUDI-8380 - Type: Sub-task...

priority:critical
from-jira
type:devtask