
[Feature][Hive Sink] Add support for AWS Glue Data Catalog as metastore in SeaTunnel Hive Sink connector

Open zhangzhao2010 opened this issue 4 months ago • 2 comments

Search before asking

  • [x] I have searched the existing feature requests and found no similar request.

Description

Current Behavior

Currently, the SeaTunnel Hive Sink connector supports only a traditional Hive metastore, configured through the metastore_uri parameter. When AWS EMR uses Glue Data Catalog as the metastore, the connector fails because it has no option for selecting that kind of metastore.

Requested Feature

Add support for AWS Glue Data Catalog as a metastore option in the SeaTunnel Hive Sink connector.

Proposed Solution

Extend the Hive Sink connector to support AWS Glue Data Catalog as a metastore option by:

  1. Adding a new configuration parameter like use_glue_catalog: true to indicate that AWS Glue Data Catalog should be used instead of a traditional Hive metastore
  2. Supporting AWS credentials configuration for Glue access
  3. Implementing the code needed to call the Glue Data Catalog API instead of the Hive metastore when this option is enabled
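
To make step 1 concrete, here is a minimal sketch of how the connector might choose a metastore from its options. It is only an illustration: `use_glue_catalog` is the option name proposed above and does not exist yet, and the class `MetastoreSelector` and its enum are hypothetical names, not part of SeaTunnel.

```java
import java.util.Map;

/** Hypothetical sketch: decide which metastore the Hive sink should talk to. */
final class MetastoreSelector {

    /** The two metastore kinds discussed in this issue. */
    enum MetastoreType { HIVE_THRIFT, AWS_GLUE }

    private MetastoreSelector() {}

    /**
     * Picks the metastore type from the sink options.
     * "use_glue_catalog" is the option proposed in this issue (an assumption);
     * "metastore_uri" is the option the connector already has.
     */
    static MetastoreType select(Map<String, String> options) {
        boolean useGlue =
                Boolean.parseBoolean(options.getOrDefault("use_glue_catalog", "false"));
        if (useGlue) {
            // Glue needs no thrift endpoint, so metastore_uri is not checked here.
            return MetastoreType.AWS_GLUE;
        }
        String uri = options.get("metastore_uri");
        if (uri == null || uri.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "metastore_uri is required unless use_glue_catalog is true");
        }
        return MetastoreType.HIVE_THRIFT;
    }
}
```

With this shape, existing jobs that set only metastore_uri keep their current behavior, and the Glue path is opt-in.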

Benefits

  • Seamless integration with AWS EMR and Glue Data Catalog
  • Better support for AWS ecosystem
  • No need to run a separate Hive metastore service when using AWS Glue

Additional Context

This feature would be particularly valuable for users who are running SeaTunnel in AWS environments and want to leverage the managed Glue Data Catalog service rather than maintaining their own Hive metastore.

Usage Scenario

I'm using AWS EMR with Glue Data Catalog as the metastore for my Hive tables. I need to synchronize data from Aurora MySQL to Hive tables using SeaTunnel, but the current Hive Sink connector doesn't support Glue Data Catalog integration.

Related issues

No response

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

zhangzhao2010, Sep 12 '25 08:09

Is there any plan to include this in the next version, 2.14? @zhangzhao2010

triones-adam, Oct 02 '25 13:10

Maybe this can be done by setting hive.metastore.client.factory.class to AWSGlueDataCatalogHiveClientFactory, as described in the EMR documentation:

https://docs.amazonaws.cn/emr/latest/ReleaseGuide/emr-hive-metastore-glue.html
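
A rough sketch of what that property change could look like in code. The fully qualified class name is the one the EMR guide uses in its hive-site setting. The helper `GlueHiveConf` is a hypothetical name, and whether the Hive Sink forwards such properties to its metastore client is an assumption that has not been verified here. The Glue client library would also have to be on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: build Hive client properties that point at Glue. */
final class GlueHiveConf {

    static final String FACTORY_KEY = "hive.metastore.client.factory.class";
    static final String GLUE_FACTORY =
            "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory";

    private GlueHiveConf() {}

    /** Returns a copy of the given properties with the Glue client factory set. */
    static Map<String, String> withGlueFactory(Map<String, String> base) {
        // Copy first so the caller's map is left untouched.
        Map<String, String> conf = new LinkedHashMap<>(base);
        conf.put(FACTORY_KEY, GLUE_FACTORY);
        return conf;
    }
}
```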

dyp12, Dec 09 '25 08:12