datacube-core
Unable to Work with Temporary AWS Security Credentials
Expected behaviour
We should be able to use temporary/expiring/refreshing AWS security credentials while running ODC code, e.g. via the AssumeRoleWithWebIdentity call of the AWS Security Token Service (STS). This can be handled automatically by boto3:
When you do this, Boto3 will automatically make the corresponding AssumeRoleWithWebIdentity calls to AWS STS on your behalf. It will handle in-memory caching as well as refreshing credentials, as needed.
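For reference, a minimal sketch of the expected behaviour, assuming the standard web-identity environment variables (AWS_ROLE_ARN, AWS_WEB_IDENTITY_TOKEN_FILE) are set, e.g. by an EKS service account; the bucket name is a placeholder:

```python
# Sketch only: assumes AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are set in
# the environment (e.g. injected by an EKS service account).
import boto3

session = boto3.Session()    # the default credential chain finds the web-identity token
s3 = session.client("s3")

# boto3 calls sts:AssumeRoleWithWebIdentity under the hood and transparently
# re-assumes the role when the temporary credentials expire.
s3.list_objects_v2(Bucket="example-bucket", MaxKeys=1)  # placeholder bucket
```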
Actual behaviour
ODC code accessing AWS APIs (such as S3) works initially when the correct environment variables are set, but starts failing once the credentials expire, which for OIDC/WebIdentityProvider happens after 2 hours by default, and never recovers: the credentials are never renewed.
This is inadequate for long processing jobs and for server applications.
More details
There is a comment at https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/aws/__init__.py#L468-L472 indicating that this is known behaviour when using datacube.utils.aws.configure_s3_access().
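For illustration, roughly the pattern that hits this (the product and query below are hypothetical; configure_s3_access() is the real helper referenced above):

```python
import datacube
from datacube.utils.aws import configure_s3_access

configure_s3_access()    # credentials are resolved once here and frozen into GDAL/rasterio

dc = datacube.Datacube()

# Reads against an S3-backed product work at first, but once the initial
# temporary credentials expire (default 2 hours for web identity) subsequent
# loads fail, because nothing ever refreshes them.
data = dc.load(
    product="my_s3_backed_product",   # hypothetical product
    x=(148.0, 148.2),
    y=(-35.4, -35.2),
    time="2023-01",
)
```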
Fixing this may be as simple as removing most of the custom AWS setup code we have, as I believe some of it is no longer required now that GDAL and rasterio have better AWS support.
Environment information
- Which datacube --version are you using? 1.8.17
- What datacube deployment/environment are you running against?
@benjimin
Is there a reason you can't use IAM credentials (which auto-renew)? (These can be configured in the datacube.conf config file or via environment variables.)
They only auto-renew when using the boto3 library; once inside GDAL they no longer do. What's more, it's really tricky to tell why a read failed: there is no clear "expired credentials" error. One can work around this by running a service thread that copies frozen credentials from boto3 to GDAL at a regular interval (see the sketch below). Since we don't really have a place to put "IO driver state for a given dc.load", we use globals and hacky Dask code injections to make authentication work at all.
A proper solution will require introducing "shared state" for the dc.load IO driver; right now we can't even access two different buckets with two different sets of credentials from the same process or on the same Dask cluster.
There are several types of AWS credentials. What I'm interested in right now is AssumeRoleWithWebIdentity, which is similar to OIDC. Support for it was added to GDAL in 3.6 (November 2022).
KK:
once inside GDAL they no longer do
This used to be the case, but I think it was fixed in 3.1.0.
- /vsis3/: for a long living file handle, refresh credentials coming from EC2/AIM (#1593) https://github.com/OSGeo/gdal/issues/1593
KK:
right now we can't even access two different buckets with two different sets of credentials from the same process
I'm not sure how rasterio or ODC fit in, but since 3.5 GDAL has supported using a configuration file to define per-path-prefix credentials or options.
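Per the GDAL documentation, such a file looks roughly like this (bucket names and values are placeholders, and the exact syntax should be checked against the docs; the file is pointed to by the GDAL_CONFIG_FILE environment variable):

```
[credentials]

[.private_bucket]
path=/vsis3/my-private-bucket
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...

[.requester_pays_bucket]
path=/vsis3/some-other-bucket
AWS_REQUEST_PAYER=requester
```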
I should have said "the way we put those into GDAL using rasterio cannot be refreshed and cannot use multiple sets of credentials". Does rasterio support talking to GDAL in a way that allows AssumeRoleWithWebIdentity with auto-refresh? That would be the first thing to figure out.
Looks like rasterio's AWSSession is not aware of the AWS_WEB_IDENTITY_TOKEN_FILE environment variable, and just pushes frozen credentials obtained with boto3 into GDAL. You'd need a custom AWSSession class that doesn't use boto3 to obtain a set of frozen credentials and instead lets GDAL deal with it internally (a sketch of one possible shape follows the link below).
https://github.com/rasterio/rasterio/blob/b8911b29001c0a2e67320741770bc35b260ed88e/rasterio/session.py#L318-L335
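An untested sketch of what that could look like (the class name is made up): hand no frozen credentials to GDAL at all, so that GDAL's own credential chain, which includes AssumeRoleWithWebIdentity support from 3.6, resolves and refreshes them itself.

```python
from rasterio.session import Session


class GDALManagedAWSSession(Session):
    """Hypothetical session that delegates all AWS credential handling to GDAL
    instead of freezing boto3 credentials."""

    def __init__(self, requester_pays=False):
        self.requester_pays = requester_pays
        self._creds = None        # deliberately never populated via boto3

    @property
    def credentials(self):
        return {}                 # nothing frozen to hand over to GDAL

    def get_credential_options(self):
        # Pass through non-credential options only; GDAL resolves and refreshes
        # the actual credentials itself (env vars, IMDS, web identity, ...).
        opts = {}
        if self.requester_pays:
            opts["AWS_REQUEST_PAYER"] = "requester"
        return opts
```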
Maybe raise an issue in rasterio, something like "Support delegating AWS credentialization to GDAL".
In datacube, the custom Session would be plugged in here, based on some config setting:
https://github.com/opendatacube/datacube-core/blob/f3323b95eebae634c5e20d3af1d428d4f4a8ea9b/datacube/utils/rio/_rio.py#L90
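Purely as an illustration of how it would be activated on the rasterio side (using the hypothetical GDALManagedAWSSession sketched above and a placeholder object path):

```python
import rasterio

# No frozen credentials are injected here, so GDAL can keep refreshing the
# web-identity credentials itself for as long as the process runs.
with rasterio.Env(session=GDALManagedAWSSession()):
    with rasterio.open("s3://example-bucket/example.tif") as src:
        print(src.profile)
```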
Thanks for the pointers!