iceberg-python

Support getting a snapshot right before the given timestamp

Open chinmay-bhat opened this issue 9 months ago • 2 comments

Add support for retrieving the snapshot right before a given timestamp, which is needed to implement Spark procedures such as rollback_to_timestamp.
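To make the request concrete, here is a minimal sketch of the lookup, assuming the table history is available as a list of (snapshot ID, commit timestamp) entries. `LogEntry` and `snapshot_id_before` are hypothetical names used only for illustration; they are not PyIceberg's API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical stand-in for one entry of a table's snapshot history.
    snapshot_id: int
    timestamp_ms: int


def snapshot_id_before(log: Iterable[LogEntry], timestamp_ms: int) -> Optional[int]:
    """Return the ID of the newest entry committed strictly before timestamp_ms."""
    latest: Optional[LogEntry] = None
    for entry in log:
        # Only consider entries older than the cutoff, and keep the newest of those.
        if entry.timestamp_ms < timestamp_ms and (
            latest is None or entry.timestamp_ms > latest.timestamp_ms
        ):
            latest = entry
    return latest.snapshot_id if latest is not None else None


history = [LogEntry(1, 1000), LogEntry(2, 2000), LogEntry(3, 3000)]
print(snapshot_id_before(history, 2500))
```

A rollback_to_timestamp-style procedure could then point the table at the returned snapshot; when no snapshot is older than the cutoff, the sketch returns `None` rather than raising.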

See comment in issue

chinmay-bhat, May 16 '24 13:05

Hello @chinmay-bhat,

I noticed that you are implementing the ancestors_of method, and we have another pull request (#533) that is implementing the same behavior in another place as a function with a different output (Iterable[Snapshot] instead of a List[tuple[int, int]]) and signature (expects a Snapshot instead of a Snapshot ID).

I believe that we need to discuss and choose which one we want to have in the codebase.

cc/ @HonahX @Fokko @syun64

ndrluis, May 16 '24 18:05

Hi @ndrluis, thank you for flagging this! That PR flew under my radar, and I'm excited to see an incremental scanning feature already being implemented in PyIceberg.

As for the question on the output type, I'm +1 for using Iterable[Snapshot], because I prefer a class with defined attributes over a tuple.
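As a rough illustration of the Iterable[Snapshot] shape, an ancestor walk could be written as a generator along these lines. This is only a sketch: the `Snapshot` dataclass and the `by_id` lookup dictionary below are simplified stand-ins assumed for the example, not the code from either PR.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class Snapshot:
    # Simplified stand-in; the real class carries more fields.
    snapshot_id: int
    parent_snapshot_id: Optional[int]
    timestamp_ms: int


def ancestors_of(
    snapshot: Optional[Snapshot], by_id: Dict[int, Snapshot]
) -> Iterator[Snapshot]:
    """Yield the given snapshot, then each parent, from newest to oldest."""
    while snapshot is not None:
        yield snapshot
        parent_id = snapshot.parent_snapshot_id
        # Stop when there is no parent or the parent is not in the lookup.
        snapshot = by_id.get(parent_id) if parent_id is not None else None


snapshots = {
    1: Snapshot(1, None, 1000),
    2: Snapshot(2, 1, 2000),
    3: Snapshot(3, 2, 3000),
}
for ancestor in ancestors_of(snapshots[3], snapshots):
    print(ancestor.snapshot_id, ancestor.timestamp_ms)
```

Callers read `ancestor.snapshot_id` and `ancestor.timestamp_ms` by name instead of unpacking tuple positions, which is the main argument for returning the class.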

I'm also +1 for introducing the feature in this separate PR, since it's a much simpler feature that we can introduce quickly. WDYT?

sungwy, May 16 '24 19:05

Happy to update the output type to Iterable[Snapshot]! Also, I really like how concise the ancestors_of function is in the other PR.

chinmay-bhat, May 17 '24 08:05

Thank you for the review, @Fokko, @HonahX, @syun64 and @ndrluis! 🚀

chinmay-bhat, Jun 03 '24 16:06

Merged! Thanks @chinmay-bhat for the great work! Thanks @Fokko @syun64 @ndrluis for the review and discussions!

HonahX, Jun 03 '24 16:06

Congrats on your first PR, @chinmay-bhat!

sungwy, Jun 09 '24 13:06