hudi-rs
Support CoW incremental query
From the official docs, there are two ways to implement incremental queries.
- Configuration passed by options, details: Spark Incremental Query For Hudi-0.13.0
- Through the hudi_table_changes TVF (table-valued function), details: Spark Incremental Query For Hudi-0.14.1
Which method do you suggest using?
@xushiyan Hullo, I would like to work on this. The high-level implementation would be to:
- Use the timeline to retrieve the latest commit, or a specific commit, to use as a checkpoint
- Get metadata for the commits
- Query the data changed since that checkpoint

If there are any more specifics, let me know!
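To make the three steps concrete, here is a rough sketch in plain Rust. Everything in it is hypothetical and for illustration only: `Commit`, `resolve_checkpoint`, `commits_after`, and `changed_files` are made-up names, not the hudi-rs API, and the timeline is modeled as an in-memory slice rather than read from the table's metadata. It also assumes commit timestamps are fixed-width strings, so string order matches time order.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for one completed commit on the timeline.
/// This is not a hudi-rs type.
#[derive(Debug, Clone, PartialEq)]
struct Commit {
    /// Commit instant, e.g. "20240101000000". Assumed fixed-width, so
    /// lexicographic order equals chronological order.
    timestamp: String,
    /// Base files written by this commit (taken from its commit metadata).
    written_files: Vec<String>,
}

/// Step 1: choose the checkpoint. Use the requested instant if one is given,
/// otherwise fall back to the latest commit on the timeline.
fn resolve_checkpoint(timeline: &[Commit], requested: Option<&str>) -> Option<String> {
    match requested {
        Some(ts) => Some(ts.to_string()),
        None => timeline.iter().map(|c| c.timestamp.clone()).max(),
    }
}

/// Step 2: collect the commits strictly after the checkpoint, oldest first.
fn commits_after<'a>(timeline: &'a [Commit], checkpoint: &str) -> Vec<&'a Commit> {
    let mut selected: Vec<&Commit> = timeline
        .iter()
        .filter(|c| c.timestamp.as_str() > checkpoint)
        .collect();
    selected.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
    selected
}

/// Step 3: the de-duplicated, sorted set of files an incremental read would scan.
fn changed_files(commits: &[&Commit]) -> Vec<String> {
    let files: BTreeSet<String> = commits
        .iter()
        .flat_map(|c| c.written_files.iter().cloned())
        .collect();
    files.into_iter().collect()
}
```

A caller would resolve a checkpoint once, pass it to `commits_after`, and hand the result of `changed_files` to the reader. The sketch stops at file selection; filtering individual records down to those written after the checkpoint is left out.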
Thank you both for the interest! We will do table API support first for incremental queries, and then move on to SQL support using DataFusion. I'll lay out some groundwork first before splitting out more follow-up tasks.
@xushiyan Is there anything I can help with for providing table API support?