hadoop-connectors
Use default credentials also when running outside of Google Cloud
Currently, to run the gcs-connector on-premises we have to download and set up an explicit OAuth 2.0 private key (as explained in the docs). It would be easier if the gcs-connector could use gcloud's default credentials, which in our case we already have to set up for other purposes anyway.
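For reference, the keyfile-based setup being described looks roughly like this in `core-site.xml` (a sketch using the property names from the connector's configuration docs; the keyfile path is a placeholder):

```xml
<!-- core-site.xml: explicit service-account keyfile auth for the GCS connector -->
<property>
  <name>google.cloud.auth.service.account.enable</name>
  <value>true</value>
</property>
<property>
  <name>google.cloud.auth.service.account.json.keyfile</name>
  <!-- placeholder path: every node running jobs needs a copy of this key -->
  <value>/path/to/service-account-key.json</value>
</property>
```

Having to distribute that key to every node is exactly the friction this issue is about.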
Second this request! :+1: This would make it much easier to integrate the gcs-connector with https://github.com/broadinstitute/gatk, since for the convenience of our users we've standardized on using gcloud default credentials for GCS access.
Explanation from @dennishuo of what needs to be done to implement this feature: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/52#issuecomment-305675143
@medb @dennishuo Has there been any progress on making the gcs-connector respect default credentials? We're having authentication problems when we want to run on a GCE node that was configured using the Google Pipelines API. We don't want to be forced to pass a keyfile around as a job input, but we're not sure how to authenticate the gcs-connector without one. Is there a way to make it authenticate using the metadata server instead?
@lbergelson In fact, by default the GCS connector uses the GCE metadata server for authentication, so if you do not set any auth configuration options it should work out of the box. It works this way on Dataproc VMs too.
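To illustrate the fallback @medb describes: on a GCE VM, credentials come from the metadata server at `metadata.google.internal`. A quick way to check whether that server is reachable from your VM is a small probe like this (a sketch; `metadata_server_available` is a hypothetical helper, not part of the connector):

```python
import urllib.request

# Standard GCE metadata endpoint for the default service account's token.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)

def metadata_server_available(timeout=1.0):
    """Return True if the GCE metadata server answers, False otherwise.

    Off-GCE the hostname does not resolve (or the request times out),
    so this returns False instead of raising.
    """
    req = urllib.request.Request(
        METADATA_TOKEN_URL,
        headers={"Metadata-Flavor": "Google"},  # required header, else 403
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, timeouts, and DNS failures all land here
        return False
```

If this returns True and no auth properties are set, the connector should pick up credentials without any keyfile.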
@medb Ack. You're right. I apologize. Our authentication problems were due to some confusing issues with missing roles that were hidden because a stacktrace wasn't fully output in the error message I saw. Sorry to blame the connector!
Does the connector now support Application Default Credentials as well?
GCS connector 3.0.0 (expected release in Q1 2023) will support Application Default Credentials: https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/CONFIGURATION.md#authentication
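Per the linked CONFIGURATION.md, the 3.x line selects the credential mechanism via a single `fs.gs.auth.type` property; opting into Application Default Credentials should look roughly like this (a sketch; check the doc for the exact value supported by your connector version):

```xml
<!-- core-site.xml: use Application Default Credentials (GCS connector 3.x) -->
<property>
  <name>fs.gs.auth.type</name>
  <value>APPLICATION_DEFAULT</value>
</property>
```

With this set, the connector defers to whatever `gcloud auth application-default login` or the runtime environment provides, so no keyfile needs to be shipped with jobs.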