
Better authentication for GCP

Open djouallah opened this issue 3 years ago • 1 comments

Currently, to connect to GCP, a Delta table requires an environment variable, something like this:

import os
os.environ["SERVICE_ACCOUNT"] = 'secret_API.json'

That is fine when using a notebook, for example, but when I tried to deploy it to a Google Cloud Function, it did not work and complained about an authentication error. Currently I am just using an Arrow dataset, as it works out of the box, but I would love for delta to have the same behavior.

djouallah avatar Sep 08 '22 13:09 djouallah

Allowing access to cloud storage via fsspec should fix most connection/auth issues, but I have no idea how much effort it would be to implement this.

sa- avatar Sep 09 '22 13:09 sa-

Arrow by default follows Application Default Credentials to authenticate the user. It would be great if deltalake supported this as well! Using service account keys is no longer the preferred method for authentication in GCP. We should be able to use workload identity federation.
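
For anyone unfamiliar with the lookup order, here is a rough sketch of the file-based part of Application Default Credentials on Linux/macOS. It is simplified and is not how deltalake or Arrow implement it: `find_adc_file` is a hypothetical helper name, the `CLOUDSDK_CONFIG` override and the Windows path are not handled, and a `None` result stands in for the last step, where the attached service account is read from the metadata server (which is what a Cloud Function would rely on).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the credential file ADC would consider first, or None.

    Hypothetical helper for illustration only; real client libraries
    also validate the file contents and handle more cases.
    """
    # 1. An explicit path in GOOGLE_APPLICATION_CREDENTIALS wins.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)

    # 2. Otherwise, the file written by
    #    `gcloud auth application-default login` (Linux/macOS location).
    home = home or Path.home()
    well_known = (
        home / ".config" / "gcloud" / "application_default_credentials.json"
    )
    if well_known.is_file():
        return well_known

    # 3. No file found: a real client would now fall back to the
    #    metadata server (Cloud Functions, GCE, GKE, Cloud Run).
    return None
```

In a Cloud Function neither file normally exists, so step 3 is the one that matters there, and no key file has to be shipped with the code.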

renzepost avatar Nov 28 '22 13:11 renzepost

This is something that we need to add to https://github.com/apache/arrow-rs/tree/master/object_store first. PRs welcome!

houqp avatar Nov 28 '22 16:11 houqp