trino
Fix Iceberg $files metadata table not showing delete files
Description
This PR aims to fix the `$files` table not showing delete files for the Iceberg v2 format. https://github.com/trinodb/trino/issues/16233
Additional context and related issues
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:
# Section
* Fix `$files` table not showing delete files for iceberg v2 format. ({issue}`[16233]`)
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
FYI: I have already signed the CLA, and I think it is still being processed at the moment.
Thanks @ebyhr for the review. I have updated the PR according to the comment. Please take a look when you have some time.
@krvikash Thanks for the review. I just updated the PR accordingly. Please take a look when you have some time.
@cla-bot check
The cla-bot has been summoned, and re-checked this pull request!
Hi @0xffmeta, I'm adding a test to validate the `$files` system table output and fix the problem raised in https://github.com/trinodb/trino/issues/16473. To do so, I'll cherry-pick your 1st commit `Wrap collection values in array blocks`. I hope you don't mind.
@krvikash Thanks for the reminder. No problem at all.
@0xffmeta FYI https://github.com/trinodb/trino/pull/16519/files
Hi @0xffmeta, https://github.com/trinodb/trino/pull/16519 is merged now. Could you please rebase and resolve the conflicts?
@krvikash I just updated this PR. Please take a look when you have some time.
Hi @krvikash, just want to check if this PR can be merged or not.
Hi @krvikash, are you able to review this PR again to see if this can be merged? Thanks.
@0xffmeta, Sorry for the late response. Overall LGTM.
@alexjo2144 Could you please take a look?
@ebyhr @krvikash @alexjo2144 ping. is there any additional help or input needed for this fix to be merged?
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
:wave: @0xffmeta - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers and maintainers to help.
cc @bitsondatadev @findepi @alexjo2144 @findinpath
We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
This is still relevant. The Iceberg `$files` table assigns content=0 to delete files.
@alexjo2144 @krvikash This is important to fix.
One thing I've observed people trying to do is to build some way to identify whether their table needs to be optimized, for example to remove delete files.
A simple approach is to check whether the total size of delete files, or the count of records in delete files, is above some threshold. Unfortunately Trino cannot be used for this today because `$files` doesn't show delete files.
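To make the threshold idea concrete, here is a minimal sketch of that check, assuming `$files` reports delete files with `content` 1 (position deletes) or 2 (equality deletes), as the Iceberg spec defines them. The `FileRow` type, the `needs_optimize` helper, and both thresholds are hypothetical names for illustration; the rows would come from something like `SELECT content, record_count, file_size_in_bytes FROM "my_table$files"`.

```python
from dataclasses import dataclass
from typing import Iterable

# Iceberg content codes: 0 = data, 1 = position deletes, 2 = equality deletes.
DELETE_CONTENT = frozenset({1, 2})


@dataclass(frozen=True)
class FileRow:
    """Hypothetical holder for one row of the $files metadata table."""
    content: int
    record_count: int
    file_size_in_bytes: int


def needs_optimize(
    rows: Iterable[FileRow],
    max_delete_bytes: int,
    max_delete_records: int,
) -> bool:
    """Return True when delete files exceed either threshold (both are assumptions)."""
    delete_rows = [r for r in rows if r.content in DELETE_CONTENT]
    total_bytes = sum(r.file_size_in_bytes for r in delete_rows)
    total_records = sum(r.record_count for r in delete_rows)
    return total_bytes > max_delete_bytes or total_records > max_delete_records
```

With the current behavior, every row comes back with `content = 0`, so a check like this would never fire, which is why the fix matters for this use case.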
Maybe @electrum @findepi or @findinpath can help out here.
Also @0xffmeta could you rebase?