Retain table statistics during orphan files removal
Do not delete table statistics files when running remove_orphan_files.
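To illustrate the intent, here is a minimal, language-neutral sketch in Python. It is not Iceberg's Java implementation: `TableMetadataSketch`, `reachable_files`, and `find_orphans` are hypothetical names invented for this example, and the file paths are made up. The only idea it shows is that statistics files have to be counted as reachable, otherwise orphan removal treats them as garbage.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadataSketch:
    """Hypothetical stand-in for the file locations a table references."""
    data_files: set = field(default_factory=set)
    manifest_files: set = field(default_factory=set)
    metadata_files: set = field(default_factory=set)
    statistics_files: set = field(default_factory=set)


def reachable_files(table):
    """Return every location the table still references.

    Statistics files are part of the union; leaving them out is what
    would cause orphan removal to delete them.
    """
    return (
        table.data_files
        | table.manifest_files
        | table.metadata_files
        | table.statistics_files
    )


def find_orphans(storage_listing, table):
    """Return files present on storage that the table does not reference."""
    return set(storage_listing) - reachable_files(table)


# Usage with made-up paths: only the stale data file is reported.
table = TableMetadataSketch(
    data_files={"data/a.parquet"},
    manifest_files={"metadata/m0.avro"},
    metadata_files={"metadata/v1.metadata.json"},
    statistics_files={"metadata/stats-1.stats"},
)
listing = reachable_files(table) | {"data/stale.parquet"}
orphans = find_orphans(listing, table)
```

In this sketch the statistics file never shows up in `orphans` because it is included in the reachable set before the set difference is taken.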
Extracted from https://github.com/apache/iceberg/pull/4741 and based on it, as well as on https://github.com/apache/iceberg/pull/5794 and https://github.com/apache/iceberg/pull/5799.
Rebased after https://github.com/apache/iceberg/pull/5799 merged; no other changes.
Currently depends on https://github.com/apache/iceberg/pull/5794.
Rebased after https://github.com/apache/iceberg/pull/5794 merged.
@rdblue, please take a look.
This looks fine, but it requires the public API to return statisticsFiles from Table so we should get that one in first.
Please see my response: https://github.com/apache/iceberg/pull/5795#discussion_r983265802
Applied or responded to comments.
I didn't do anything with the public Table API yet.
However, I don't think it should be a blocker, since the affected ReachableFileUtil class already depends on non-API information in other methods. We can improve ReachableFileUtil in a follow-up after https://github.com/apache/iceberg/pull/4741 lands, or we can land https://github.com/apache/iceberg/pull/4741 first and then make the improvement in this PR.
@rdblue, please take another look.
(Just rebased after https://github.com/apache/iceberg/pull/4741 merged; no changes yet.)
This looks fine, but it requires the public API to return statisticsFiles from Table so we should get that one in first.
Done now.
@rdblue, please take another look.
Thanks, @findepi! I merged this.
Thank you, @rdblue, for the merge!