trino
                                
                                 trino copied to clipboard
                                
                                    trino copied to clipboard
                            
                            
                            
                        Support Parquet Modular Encryption for Trino
Description
This large PR is to support Parquet Modular Encryption for Trino. The PR core logic is migrated from the one in Presto with several changes to bridge the big gaps between APIs in these two repos.
To avoid no trino-hdfs dep in other modules, we pass the trino decryption configs in parquet configs and then translate to parquet flags. Not sure if there is a better way to do this.
For Trino issue: https://github.com/trinodb/trino/issues/9383
Will fix the test and linting failures after the implementation is stable.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required. ( ) Release notes are required. Please propose a release note for me. ( ) Release notes are required, with the following suggested text:
# Section
* Fix some things. ({issue}`issuenumber`)
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
Done
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
👋 @my7ym .. please let us know if you need any help and ping us when this is ready for review again.
@amoghmargoor @findepi @electrum can you potentially help here and work on the removal of any Hadoop deps as discussed in the contributor call https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-1-feb-2024
In fact looking at the message in the PR it looks like these removals are done. Maybe all we need is for @my7ym to rebase and then coordinate reviews and changes.
Also @my7ym .. I am not sure if you are on trino slack .. if so please let us know and chime in at https://trinodb.slack.com/archives/CP1MUNEUX/p1706832386414769 or sync up in general with us to continue this work together
@my7ym Great work on reimplementing the Parquet code to not require Hadoop. I'll review this ASAP.
Also fyi @raunaqmorarka
It looks like this code still depends on Hadoop. The config also contains a class name that is loaded at runtime, which is something we avoid in Trino. I think we'll want to have a list of supported KMS implementations that work out-of-the-box with appropriate config.
It would be helpful to have a product test that shows this working end-to-end.
It looks like this code still depends on Hadoop. The config also contains a class name that is loaded at runtime, which is something we avoid in Trino. May I quickly know what class it is?
I think we'll want to have a list of supported KMS implementations that work out-of-the-box with appropriate config.
Good call. Will do.
It would be helpful to have a product test that shows this working end-to-end. May I know where is the best place to add it? Any existing example?
Thanks!
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
@my7ym let us know if you need any help with rebasing and next steps. cc @electrum
@mosabua A bit tied up in the past weeks. Will work on this in next week. Will let you know if I have more questions. Thanks for checking in.
hi @my7ym, I'm working on a project that needs this feature, and I've been following this PR with interest. Is the branch featured in this PR in a workable state (with the understanding that there are Hadoop deps that currently prevent merging into Trino)?
I'm eager to assist if there's any way I can help to move this forward. I'm relatively new to Trino but very enthusiastic about contributing to the community :)
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
@electrum @amoghmargoor how do we want to proceed here?
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.
Can you provide guidance on how to proceed here @electrum ?
@my7ym could you proceed to remove the hadoop dependencies as requested by @electrum and rebase the PR?