iceberg icon indicating copy to clipboard operation
iceberg copied to clipboard

Core, API: Support scanning from refs

Open amogh-jahagirdar opened this issue 2 years ago • 8 comments

This change adds support for scanning from refs, where the scan can either be a time travel from a tag or a time travel from the tip of a branch.

amogh-jahagirdar avatar Jul 27 '22 01:07 amogh-jahagirdar

@namrathamyske @rdblue @jackye1995 I've updated the PR based on our discussion on this thread . Let me know what you think! I still need to add tests.

amogh-jahagirdar avatar Aug 23 '22 17:08 amogh-jahagirdar

Overall looks good to me, can we add some tests?

jackye1995 avatar Aug 23 '22 17:08 jackye1995

@jackye1995 Definitely, I'm still working on adding tests, wanted to get a discussion on the code in parallel.

amogh-jahagirdar avatar Aug 23 '22 17:08 amogh-jahagirdar

Java CI build check and revapi check both fail with:

  • What went wrong: Plugin [id: 'nebula.dependency-recommender', version: '11.0.0'] was not found in any of the following sources:

https://plugins.gradle.org/ is down for me (get a 1016 error from cloudflare) so most likely can't resolve any plugins. I'll retrigger the build when it becomes available again.

amogh-jahagirdar avatar Aug 24 '22 00:08 amogh-jahagirdar

Updated with more tests @namrathamyske @jackye1995 @hililiwei @rdblue let me know what you think!

amogh-jahagirdar avatar Aug 24 '22 16:08 amogh-jahagirdar

I should have time to look today. Thanks, @amogh-jahagirdar!

rdblue avatar Aug 24 '22 18:08 rdblue

@rdblue I considered this when I changed the implementation to use snapshot timestamps instead of History entries. To me the timestamps on the history entries and snapshot log are 1:1. Just for my understanding, what are the cases where the timestamp on the history entry as a result of a new snapshot, is different than the timestamp on the snapshot itself? It seems like they would always be the same.

Edit: Okay I see in the code here: SnapshotLogEntry's timestamp would be different, if it wasn't considered as an "added" snapshot. But I'm still not sure what case that is tbh.

amogh-jahagirdar avatar Aug 24 '22 21:08 amogh-jahagirdar

Discussed offline with @rdblue , I got an understanding of how history entries are different than the snapshot ancestor lineage, posting the information here so folks referring to the PR have context:

"History is the state of main at any given time. That can be changed, much like master in git can be changed. Ancestors can't be changed."

The current main can change when rollback is performed. The snapshot which gets resolved for a given time T should remain constant regardless of what is the current main ancestor lineage. History entries capture all the changes on the main table state and are what enable this ability to time travel to a snapshot which are outside the current table ancestors which is ultimately desired behavior. It also doesn't make sense to treat main as a separate case, and have a different semantic for non-main branches when it comes to time travel.

Maybe we could maintain separate history metadata for branches, but most likely it would need to be opt-in by users for specific branches and would need to not add too much metadata to manage.

So for now, it makes sense just to do time travel to a branch tip or time travel to a given tag. Updated the PR @rdblue @namrathamyske @jackye1995

amogh-jahagirdar avatar Aug 26 '22 01:08 amogh-jahagirdar

Looks great. Thanks, @amogh-jahagirdar! I'll merge when tests are passing. I had to fix a conflict in revapi.

rdblue avatar Oct 02 '22 23:10 rdblue

Thanks @rdblue @yzhaoqin @namrathamyske @hililiwei for the review!

amogh-jahagirdar avatar Oct 03 '22 17:10 amogh-jahagirdar