trino icon indicating copy to clipboard operation
trino copied to clipboard

Upgrade Glue Client to AWS SDK v2

Open ShubhamChaurasia opened this issue 2 years ago • 15 comments

Description

This PR updates the glue client to AWS SDK v2. It mainly converts the GlueHiveMetastore and TrinoGlueCatalog used by respectively trino-hive and trino-iceberg to v2.

  • The packages have been relocated from com.amazonaws.services.glue to software.amazon.awssdk.services.glue. Creation of clients and requests have been moved to builder pattern.

v1 version

new DeleteDatabaseRequest().withName(database)

v2 version

DeleteDatabaseRequest.builder().name(database).build()
  • In v1 the async client extended the sync client and hence had both the sync and async methods. In v2, the async client only has the async methods which return CompletableFuture. In order to use the async method for a sync call, it needs to be blocked using join() or get(). As this is a common use case the handling has been added as a utility to io.trino.plugin.hive.metastore.glue.AwsSdkUtil.

  • The request handlers are converted to use the ExecutionInterceptor which is used to intercept the request and perform several actions during the lifecycle of the request.

  • Converting MetricsRequester to MetricsPublisher to collect metrics and store them in GlueMetastoreStats. Some of the metrics have been changed. Below is the result.

v1 version

select "awsclientexecutetime.alltime.avg", "awsclientexecutetime.alltime.count", "awsrequesttime.alltime.avg", "awsrequesttime.alltime.count", "awsclientretrypausetime.alltime.avg", "awsclientretrypausetime.alltime.count", "awsthrottleexceptions.totalcount", "awsrequestcount.totalcount", "awsretrycount.totalcount", "awshttpclientpoolavailablecount", "awshttpclientpoolleasedcount", "awshttpclientpoolpendingcount"  from "trino.plugin.hive.metastore.glue:name=hive,type=gluehivemetastore";

Result:

"awsclientexecutetime.alltime.avg","awsclientexecutetime.alltime.count","awsrequesttime.alltime.avg","awsrequesttime.alltime.count","awsclientretrypausetime.alltime.avg","awsclientretrypausetime.alltime.count","awsthrottleexceptions.totalcount","awsrequestcount.totalcount","awsretrycount.totalcount","awshttpclientpoolavailablecount","awshttpclientpoolleasedcount","awshttpclientpoolpendingcount"
"505.086064688","125.0","425.98159896799996","125.0","NaN","0.0","0","125","0","5","0","0"

v2 version

select "awsservicecallduration.alltime.avg", "awsservicecallduration.alltime.count", "awsapicallduration.alltime.avg", "awsapicallduration.alltime.count", "awsbackoffdelayduration.alltime.avg", "awsbackoffdelayduration.alltime.count", "awsthrottleexceptions.totalcount", "awsrequestcount.totalcount", "awsretrycount.totalcount", "awshttpclientpoolavailablecount", "awshttpclientpoolleasedcount", "awshttpclientpoolpendingcount"  from "trino.plugin.hive.metastore.glue:name=hive,type=gluehivemetastore";

Result:

"awsservicecallduration.alltime.avg","awsservicecallduration.alltime.count","awsapicallduration.alltime.avg","awsapicallduration.alltime.count","awsbackoffdelayduration.alltime.avg","awsbackoffdelayduration.alltime.count","awsthrottleexceptions.totalcount","awsrequestcount.totalcount","awsretrycount.totalcount","awshttpclientpoolavailablecount","awshttpclientpoolleasedcount","awshttpclientpoolpendingcount"
"920.0434782608696","23.0","924.3913043478261","23.0","0.0","23.0","0","23","0","1","0","0"

Co-authored-by: Karan Makhija[email protected]

Release notes

( ) This is not user-visible or docs only and no release notes are required. ( ) Release notes are required, please propose a release note for me. (x) Release notes are required, with the following suggested text:

`hive.metastore.glue.pin-client-to-current-region` is deprecated. Current region will be inferred automatically if running on EC2 machine.

ShubhamChaurasia avatar Jun 13 '23 01:06 ShubhamChaurasia

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

cla-bot[bot] avatar Jun 13 '23 01:06 cla-bot[bot]

I'm not sure why the original code used the async version of the interface, but we can use the synchronous GlueClient in the new version.

electrum avatar Jun 13 '23 04:06 electrum

I'm not sure why the original code used the async version of the interface, but we can use the synchronous GlueClient in the new version.

@electrum thanks for the review. I am working on addressing the other comments. On this one, I think the async client originally existed because of the batchUpdatePartitionAsync, batchGetPartitionAsync, batchCreatePartitionAsync methods in GlueHiveMetastore which operate on the batches of partitions (concurrently). I think it is appropriate for them to be async. With sync client we will have to explicitly do these operations in multiple threads. WDYT ?

TrinoGlueCatalog in iceberg module do not have any such requirements yet, and can work with sync client. I can convert that.

ShubhamChaurasia avatar Jun 13 '23 13:06 ShubhamChaurasia

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

cla-bot[bot] avatar Jun 15 '23 09:06 cla-bot[bot]

That makes sense. Let's change Iceberg to use synchronous, but use async for Hive. I note that the V1 client actually used an executor internally, but with V2 we can use the async client and reduce thread usage (but with higher fixed thread usage and overhead due to Netty).

electrum avatar Jun 21 '23 21:06 electrum

Sure @electrum, changed glue calls in iceberg module to use sync client.

ShubhamChaurasia avatar Jun 27 '23 12:06 ShubhamChaurasia

@ShubhamChaurasia pls rebase on top of master to address the code conflicts.

findinpath avatar Jul 03 '23 11:07 findinpath

@findinpath rebased and updated.

ShubhamChaurasia avatar Jul 04 '23 06:07 ShubhamChaurasia

@electrum @findinpath rebased again, could you please check ?

ShubhamChaurasia avatar Jul 18 '23 06:07 ShubhamChaurasia

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Jan 15 '24 17:01 github-actions[bot]

👋 @ShubhamChaurasia @findinpath @electrum .. could you please collaborate and figure out if this PR and approach is still valid and should be implemented.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

mosabua avatar Jan 15 '24 18:01 mosabua

@ShubhamChaurasia pls rebase. We should be now in good shape to review the changes to eventually land this PR.

findinpath avatar Jan 15 '24 19:01 findinpath

@mosabua @findinpath thanks for letting me know, will give rebase a try.

ShubhamChaurasia avatar Jan 16 '24 13:01 ShubhamChaurasia

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Feb 07 '24 17:02 github-actions[bot]

@ShubhamChaurasia do you still plan to continue with the work needed to land this PR?

findinpath avatar Feb 12 '24 13:02 findinpath

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Mar 05 '24 17:03 github-actions[bot]

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

github-actions[bot] avatar Mar 27 '24 17:03 github-actions[bot]

This PR is made redundant by the merged update with much more improvements in https://github.com/trinodb/trino/pull/20657

mosabua avatar May 17 '24 22:05 mosabua