Retain table statistics during orphan files removal

Open findepi opened this issue 3 years ago • 1 comments

Do not delete table statistics files when running remove_orphan_files.

Extracted from https://github.com/apache/iceberg/pull/4741 and based on that PR, and also on https://github.com/apache/iceberg/pull/5794 and https://github.com/apache/iceberg/pull/5799.
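
To illustrate the intent, here is a minimal, self-contained sketch of the idea, not the actual Iceberg code. Orphan-file removal treats every file under the table location that is not reachable from table metadata as deletable, so the statistics file paths have to be part of the reachable set. The `OrphanFilesSketch` and `TableFiles` names and the example paths are hypothetical and exist only for this illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class OrphanFilesSketch {

    // Hypothetical stand-in for the file locations tracked by table metadata.
    // This is NOT an Iceberg class; it only models what the sketch needs.
    static final class TableFiles {
        final List<String> dataFiles;
        final List<String> metadataFiles;
        final List<String> statisticsFiles;

        TableFiles(List<String> dataFiles, List<String> metadataFiles, List<String> statisticsFiles) {
            this.dataFiles = dataFiles;
            this.metadataFiles = metadataFiles;
            this.statisticsFiles = statisticsFiles;
        }
    }

    // Everything the table still references. Statistics files are included,
    // which is the point of the change: leaving them out would make them look orphaned.
    static Set<String> reachableFiles(TableFiles table) {
        Set<String> reachable = new TreeSet<>();
        reachable.addAll(table.dataFiles);
        reachable.addAll(table.metadataFiles);
        reachable.addAll(table.statisticsFiles);
        return reachable;
    }

    // Orphans are the files found on storage minus the reachable files.
    static Set<String> orphanFiles(Set<String> filesOnStorage, TableFiles table) {
        Set<String> orphans = new TreeSet<>(filesOnStorage);
        orphans.removeAll(reachableFiles(table));
        return orphans;
    }

    public static void main(String[] args) {
        // Example paths are made up for the demonstration.
        TableFiles table = new TableFiles(
                List.of("tbl/data/part-0.parquet"),
                List.of("tbl/metadata/v1.metadata.json"),
                List.of("tbl/metadata/snap-1.stats"));
        Set<String> onStorage = Set.of(
                "tbl/data/part-0.parquet",
                "tbl/metadata/v1.metadata.json",
                "tbl/metadata/snap-1.stats",
                "tbl/data/stray.parquet");
        // Only the unreferenced file is reported; the statistics file is retained.
        System.out.println(orphanFiles(onStorage, table));
    }
}
```

If `statisticsFiles` were left out of `reachableFiles`, the statistics file in this example would be reported as an orphan and deleted, which is the behavior this PR is meant to prevent.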

findepi avatar Sep 19 '22 12:09 findepi

rebased after https://github.com/apache/iceberg/pull/5799 merged, no other changes

currently, depends on https://github.com/apache/iceberg/pull/5794

findepi avatar Sep 21 '22 10:09 findepi

Rebased after https://github.com/apache/iceberg/pull/5794 was merged.

@rdblue please take a look

findepi avatar Sep 27 '22 11:09 findepi

> This looks fine, but it requires the public API to return statisticsFiles from Table so we should get that one in first.

please see my response: https://github.com/apache/iceberg/pull/5795#discussion_r983265802

findepi avatar Sep 29 '22 08:09 findepi

Applied or responded to comments. I haven't done anything with the public Table API yet. I don't think it should be a blocker, though, since the affected ReachableFileUtil class already depends on non-API information in other methods. We can improve ReachableFileUtil in a follow-up once https://github.com/apache/iceberg/pull/4741 lands, or we can land https://github.com/apache/iceberg/pull/4741 first and then make the improvement in this PR.

@rdblue please take another look.

findepi avatar Sep 29 '22 09:09 findepi

(just rebased after https://github.com/apache/iceberg/pull/4741 merged, no changes yet)

findepi avatar Sep 30 '22 09:09 findepi

> This looks fine, but it requires the public API to return statisticsFiles from Table so we should get that one in first.

Done now.

@rdblue please take another look

findepi avatar Sep 30 '22 09:09 findepi

Thanks, @findepi! I merged this.

rdblue avatar Sep 30 '22 15:09 rdblue

thank you @rdblue for the merge!

findepi avatar Sep 30 '22 18:09 findepi