[Improvement] Spark Connector needs a DelegationTokenProvider for k8s deployment

Open theoryxu opened this issue 1 year ago • 1 comments

What would you like to be improved?

When a Spark application is deployed on k8s in cluster mode and connects to multiple Hive Metastore (HMS) instances, Spark needs a DelegationTokenProvider to obtain delegation tokens from each HMS at submission time and store them in the UserGroupInformation, so that the Spark driver can communicate with every HMS.

For example, KyuubiHiveConnector contains the KyuubiHiveConnectorDelegationTokenProvider to deal with this case.

Currently, the Gravitino Spark Connector depends on the KyuubiHiveConnector. However, the KyuubiHiveConnectorDelegationTokenProvider filters catalogs by their implementation class, so it does not pick up catalogs in the above case. In addition, it only covers the Hive catalog, not the Iceberg catalog.
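To illustrate why filtering on the implementation class misses these catalogs, here is a minimal, self-contained sketch of that kind of filter over a plain configuration map. It is not the Kyuubi code: the class `CatalogImplFilterSketch`, the method `catalogsWithImpl`, the host names, and the catalog class `example.GravitinoHiveCatalog` are all hypothetical, and the per-catalog `hive.metastore.uris` option key is an assumption about how such catalogs are configured.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CatalogImplFilterSketch {
    static final String CATALOG_PREFIX = "spark.sql.catalog.";
    // Catalog class of the Kyuubi Hive connector (assumed name, used only for comparison).
    static final String KYUUBI_HIVE_CATALOG =
        "org.apache.kyuubi.spark.connector.hive.HiveTableCatalog";

    /** Returns the names of catalogs whose configured class equals implClass. */
    static List<String> catalogsWithImpl(Map<String, String> conf, String implClass) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(CATALOG_PREFIX)) {
                continue;
            }
            String name = key.substring(CATALOG_PREFIX.length());
            // Keys with a further dot are catalog options, not catalog definitions.
            if (name.contains(".")) {
                continue;
            }
            if (implClass.equals(e.getValue())) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.sql.catalog.hive1", KYUUBI_HIVE_CATALOG);
        conf.put("spark.sql.catalog.hive1.hive.metastore.uris", "thrift://hms-a:9083");
        // Hypothetical class standing in for a catalog registered by another connector.
        conf.put("spark.sql.catalog.hive2", "example.GravitinoHiveCatalog");
        conf.put("spark.sql.catalog.hive2.hive.metastore.uris", "thrift://hms-b:9083");

        // Only hive1 passes the class filter, so no token would be fetched for hms-b.
        System.out.println(catalogsWithImpl(conf, KYUUBI_HIVE_CATALOG)); // prints [hive1]
    }
}
```

In this sketch `hive2` points at a second HMS but is skipped because its class differs, which is the gap described above.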

The Gravitino Spark Connector needs its own DelegationTokenProvider to handle this case and to ensure it works for both Hive and Iceberg catalogs in a Kerberos environment.
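One piece such a provider would need is a way to work out which HMS endpoints require a token, across both catalog types. The sketch below is only an illustration under stated assumptions: the `Catalog` class is a hypothetical stand-in (not a Gravitino type), and the provider names `hive` and `lakehouse-iceberg` and the property keys `metastore.uris`, `catalog-backend`, and `uri` are assumed rather than taken from this issue. Fetching the tokens themselves and adding them to the credentials is left out.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class MetastoreUriSketch {
    /** Hypothetical stand-in for catalog metadata; not a Gravitino class. */
    static final class Catalog {
        final String name;
        final String provider;
        final Map<String, String> props;

        Catalog(String name, String provider, Map<String, String> props) {
            this.name = name;
            this.provider = provider;
            this.props = props;
        }
    }

    /** Returns the HMS URI the catalog talks to, or empty if it does not use an HMS. */
    static Optional<String> metastoreUri(Catalog c) {
        if ("hive".equals(c.provider)) {
            return Optional.ofNullable(c.props.get("metastore.uris"));
        }
        // Assumed: only Iceberg catalogs backed by a Hive Metastore need an HMS token.
        if ("lakehouse-iceberg".equals(c.provider)
                && "hive".equalsIgnoreCase(c.props.get("catalog-backend"))) {
            return Optional.ofNullable(c.props.get("uri"));
        }
        return Optional.empty();
    }

    /** Collects each HMS URI once, in first-seen order, so one token is requested per HMS. */
    static Set<String> distinctMetastoreUris(List<Catalog> catalogs) {
        Set<String> uris = new LinkedHashSet<>();
        for (Catalog c : catalogs) {
            metastoreUri(c).ifPresent(uris::add);
        }
        return uris;
    }

    public static void main(String[] args) {
        List<Catalog> catalogs = List.of(
            new Catalog("hive_a", "hive", Map.of("metastore.uris", "thrift://hms-a:9083")),
            new Catalog("ice_b", "lakehouse-iceberg",
                Map.of("catalog-backend", "hive", "uri", "thrift://hms-b:9083")),
            new Catalog("ice_jdbc", "lakehouse-iceberg", Map.of("catalog-backend", "jdbc")),
            new Catalog("hive_a2", "hive", Map.of("metastore.uris", "thrift://hms-a:9083")));

        // prints [thrift://hms-a:9083, thrift://hms-b:9083]
        System.out.println(distinctMetastoreUris(catalogs));
    }
}
```

Deduplicating by URI rather than by catalog name avoids requesting a second token when two catalogs share one HMS.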

REF: https://github.com/apache/kyuubi/blob/master/extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/KyuubiHiveConnectorDelegationTokenProvider.scala

https://github.com/apache/kyuubi/pull/4560

How should we improve?

No response

theoryxu, May 07 '24 08:05

cc @danhuawang, could our test cluster cover this scenario?

FANNG1, May 07 '24 09:05