iceberg-python
Incremental Append Scan
Hi @Fokko, long time no see. 😄 I have written some preliminary code for incremental reading. It still needs a lot of work, but I would like to discuss it with you at an early stage so that I stay on the right track. Could you please take a look when you have a chance? Thank you.
In the latest code commit, I tinkered with the class inheritance by introducing a new base class, BaseIncrementalScan, which inherits from TableScan. I also pushed the snapshot_id down to DataScan and shuffled a few methods around (which might cause some backward compatibility issues 💔 ). How do you think I can improve it? @Fokko
Sorry for the late correction. I've adjusted the code based on the latest comments. Could you please take a look?
@hililiwei I'm sorry, this also fell off my radar.
I managed to get a poor man's append scan working with this: https://github.com/apache/iceberg-python/issues/240#issuecomment-2248323987
Looking at this PR, wouldn't it be simpler to implement the append scan in the API by adding an append_scan method to Table, refactoring plan_files to take an optional snapshot_id, and providing a lightweight AppendScan class that makes two calls to plan_files and then compares the results?
In my case there was no need to touch __eq__ or __hash__.
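To make the suggestion concrete, here is a minimal, self-contained sketch of the "two calls to plan_files, then compare" idea. It does not use the real pyiceberg classes: `FileTask`, the `AppendScan` constructor arguments, and the injected `plan_files(snapshot_id)` callable are all hypothetical stand-ins for illustration. Files are compared by path, which is why nothing here relies on `__eq__` or `__hash__`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(eq=False)  # no custom __eq__/__hash__; tasks are compared by path below
class FileTask:
    """Hypothetical stand-in for a planned data file."""
    file_path: str


# Hypothetical planner signature: snapshot_id -> files visible in that snapshot.
PlanFiles = Callable[[int], Iterable[FileTask]]


class AppendScan:
    """Sketch only: returns files planned at the end snapshot but not at the start."""

    def __init__(self, plan_files: PlanFiles, from_snapshot_id: int, to_snapshot_id: int):
        self._plan_files = plan_files
        self.from_snapshot_id = from_snapshot_id  # exclusive starting point
        self.to_snapshot_id = to_snapshot_id      # inclusive end point

    def plan_files(self) -> List[FileTask]:
        # First call: everything already visible at the starting snapshot.
        old_paths = {task.file_path for task in self._plan_files(self.from_snapshot_id)}
        # Second call: keep only the files that were not visible before.
        return [
            task
            for task in self._plan_files(self.to_snapshot_id)
            if task.file_path not in old_paths
        ]


# Fake table history used purely for the demo (assumed data, not a real catalog).
snapshots = {
    1: ["a.parquet"],
    2: ["a.parquet", "b.parquet", "c.parquet"],
}


def fake_plan_files(snapshot_id: int) -> List[FileTask]:
    return [FileTask(path) for path in snapshots[snapshot_id]]


scan = AppendScan(fake_plan_files, from_snapshot_id=1, to_snapshot_id=2)
appended = [task.file_path for task in scan.plan_files()]
print(appended)  # ['b.parquet', 'c.parquet']
```

A plain difference like this is only meaningful when every snapshot in the range is an append. Deletes or overwrites between the two snapshots would need separate handling.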