trino icon indicating copy to clipboard operation
trino copied to clipboard

Support partitioning on nested ROW fields in Iceberg

Open krvikash opened this issue 2 years ago • 28 comments

Description

Fixes https://github.com/trinodb/trino/issues/15109 Support partitioning on nested ROW fields in Iceberg

Release notes

( ) This is not user-visible or docs only and no release notes are required. ( ) Release notes are required, please propose a release note for me. (X) Release notes are required, with the following suggested text:

# Iceberg
* Add support for partitioning on nested ROW fields. ({issue}`15712`)

krvikash avatar Jan 13 '23 13:01 krvikash

@krvikash Could you rebase on upstream to resolve conflicts?

ebyhr avatar Jan 17 '23 09:01 ebyhr

Rebased the PR with latest code.

krvikash avatar Jan 17 '23 10:01 krvikash

Hi, @alexjo2144 | @ebyhr | @findepi | @findepi, when you get time could you please review this?

krvikash avatar Jan 23 '23 13:01 krvikash

Rebased the PR with latest code.

krvikash avatar Jan 26 '23 08:01 krvikash

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Feb 27 '23 18:02 github-actions[bot]

@krvikash @alexjo2144 @findinpath I see no approvals and some yet unresolved conversations. Where are we with the PR?

findepi avatar Feb 28 '23 10:02 findepi

I believe this PR is waiting for #14837 (https://github.com/trinodb/trino/pull/15712#discussion_r1095417177)

ebyhr avatar Feb 28 '23 10:02 ebyhr

Rebased on top of https://github.com/trinodb/trino/pull/14837 's commit and resolved conflicts. This PR (2nd commit of this PR) is ready for review now.

krvikash avatar Mar 01 '23 05:03 krvikash

Fixed CI failure.

krvikash avatar Mar 01 '23 19:03 krvikash

Fixed product test (not able to verify on M1, But hopefully it will be fixed).

krvikash avatar Mar 02 '23 06:03 krvikash

Rebased and addressed comments.

krvikash avatar Mar 07 '23 20:03 krvikash

Thanks, @raunaqmorarka | @findepi for reviewing. Addressed comments and rebased with master.

krvikash avatar Mar 08 '23 17:03 krvikash

Rebased with master. CI was failing with https://github.com/trinodb/trino/issues/16406

krvikash avatar Mar 09 '23 05:03 krvikash

Rebased with master and resolved conflicts.

krvikash avatar Mar 16 '23 16:03 krvikash

Rebased with master.

krvikash avatar Mar 30 '23 12:03 krvikash

Is this still being pursued? It would be a great help for Trino's compatibility with Iceberg.

BrendanPC avatar Oct 05 '23 16:10 BrendanPC

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Jan 17 '24 17:01 github-actions[bot]

:wave: @krvikash @findepi @findinpath @bitsondatadev - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

mosabua avatar Jan 17 '24 19:01 mosabua

@mosabua Please keep this PR opened.

@krvikash Could you rebase on master to resolve conflicts?

ebyhr avatar Jan 23 '24 01:01 ebyhr

@ebyhr (rebased with master)

krvikash avatar Jan 23 '24 07:01 krvikash

(fixed CI failure)

krvikash avatar Jan 23 '24 10:01 krvikash

Rebased on master to resolve conflicts. Talked with Vikash and I'm going to take over this PR.

ebyhr avatar Feb 14 '24 08:02 ebyhr

that's a good question. I am not sure when it's safe to use block.getChildren() vs ColumnarRow I remember @djsstarburst fixing some correctness(?) issues in MERGE by switching from the former to the latter.

IIRC, the reason to use ColumnarRow is that it correctly accounts for the case of blocks containing blocks, where some of the inner block children that are null. @dain is the expert.

djsstarburst avatar Feb 14 '24 14:02 djsstarburst

Using ColumnarRow in the bucket function probably requires https://github.com/trinodb/trino/issues/9848 for any reasonable performance.

findepi avatar Feb 14 '24 14:02 findepi

(some refactoring in TestIcebergV2)

krvikash avatar Mar 07 '24 09:03 krvikash

Rebased on master to resolve conflicts.

ebyhr avatar Apr 03 '24 05:04 ebyhr

Pls rebase the code to resolve conflicts.

findinpath avatar Apr 11 '24 09:04 findinpath

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar May 10 '24 17:05 github-actions[bot]