
read_parquet/scan_parquet using gcs: enable directly providing access token in storage options

Open erikamundson opened this issue 1 year ago • 1 comments

Description

object_store provides a way to supply the GCP access token directly using with_credentials. However, because polars only supports the config options accepted by with_config, there is no way for us to authenticate directly with a token; we can only pass a service account path or JSON.

We can provide the token directly with the pyarrow reader, but that is not available for scan_parquet, which is more useful in many cases. We can also provide the token directly using polars < 0.20.0 with fsspec, but we would like to stay on the most recent release as much as possible.
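For reference, the pyarrow route mentioned above looks roughly like the sketch below. It is only an illustration, not a proposed implementation: the helper names (default_expiry, read_gcs_parquet_with_token) are made up for this example, the one-hour default lifetime is an assumption (use the real expiry from your OAuth response), and the whole file is read eagerly, so none of the lazy scan_parquet behaviour is available.

```python
import datetime as dt


def default_expiry(ttl_seconds=3600):
    """Return a timezone-aware expiry timestamp `ttl_seconds` from now.

    The one-hour default is an assumption for illustration only.
    """
    return dt.datetime.now(dt.timezone.utc) + dt.timedelta(seconds=ttl_seconds)


def read_gcs_parquet_with_token(path, access_token, expires_at=None):
    """Eagerly read one Parquet file from GCS using a raw OAuth access token.

    `path` is "bucket/key.parquet" (no "gs://" prefix), because the
    filesystem object is passed explicitly.
    """
    # Imported here so this module can be loaded without the optional
    # dependencies installed.
    import polars as pl
    import pyarrow.fs as pafs
    import pyarrow.parquet as pq

    if expires_at is None:
        expires_at = default_expiry()

    # pyarrow requires the expiration whenever an access token is given.
    fs = pafs.GcsFileSystem(
        access_token=access_token,
        credential_token_expiration=expires_at,
    )
    table = pq.read_table(path, filesystem=fs)
    return pl.from_arrow(table)
```

Usage would be something like read_gcs_parquet_with_token("my-bucket/data.parquet", token), where the bucket and object names are placeholders.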

I'm not sure what the best way to implement this would be, but it would be useful when the access token comes from something other than a service account or application default credentials, such as an OAuth flow.

erikamundson avatar Dec 19 '23 22:12 erikamundson

+1 for this. From your issue in the Arrow repository, it looks like we have very similar challenges - we too have a JupyterHub environment with access and refresh tokens generated through an OAuth flow.

mallport avatar Feb 20 '24 11:02 mallport

+1 - in a corporate environment without an option to use a JSON service file, this is key.

Skumin avatar Jun 20 '24 22:06 Skumin