django-storages
django-storages copied to clipboard
django-storages 1.14 broke an S3 credential setup that used to work
I've been grappling with an absolutely baffling problem with the S3 credentials within some of the projects I've been recently updating dependencies for, and after almost a day of frustration, I finally figured out what the problem was.
Despite my environment having the correct AWS_SECRET_ACCESS_KEY
, AWS_ACCESS_KEY_ID
, and AWS_SESSION_TOKEN
values, which are set up by a script that works for all my other projects, I was still getting the following error when trying to read files from S3 using django-storages 1.14.2:
An error occurred (403) when calling the HeadObject operation: Forbidden
And the following error when trying to upload files to S3:
An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.
(I'm including these full error messages so that hopefully later googlers can find this issue and learn the solution to their frustrations)
After almost a day of debugging, I finally tracked the problem down to the change made in #1246.
The ultimate issue is that, starting in django-storages 1.6, the AWS_SESSION_TOKEN
value was being read from the environment and used to build the credentials that django-storages would send to S3. And this was true even if AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID were defined in settings.py. But once #1246 came along, that stopped being the case, as the system would now use the access key and secret key from settings.py, and if the session token wasn't also defined there, django-storages would send None
to S3, instead of letting the value from the environment be used.
So as of 1.14, you now either need to define all three of those variables in settings.py, or none of them, when you're using temporary credentials that require the AWS_SESSION_TOKEN. If you define just the access key and secret key in settings, but leave the session token in the environment, django-storages will think it's configured correctly, but it will not be using the session token from the environment, so the secret key and access key it's using will be rejected, leading to the errors above.
It's obviously not ideal to be pulling configuration for a single thing (S3 credentials) from two different sources (django settings and environment vars). But since it used to work, it should either continue working, or explicitly reject a hybrid config by throwing an ImproperlyConfigured
exception, so that users who update to a new version of django-storages don't go through the same headache that I've been dealing with this week.
It's not the case that the security token needs to always be set though, right? Like rejecting the hybrid solution sounds reasonable assuming it can always be detected without false positives.
In general this library shipped sans security token originally so that would break backwards compat going back before I even had this library (10+ years ago).
Sorry for the debugging pain, we've all been there and it's not fun.
I think the best solution would be to change the way django-storages looks for the AWS credentials.
Right now it appears to exclusively pull from django settings, and if one of the three values isn't in there, it sends None
to boto3 for that value. Thich overrides boto3's default behavior of looking for that value in the environment, which is what is breaking my shop's old setup.
You could instead make django-storages look for each credential in the settings, then if it doesn't find the value there, look for it in the environment, and then only if both are missing, send None
to boto3.
This seems like the simplest solution, and it covers all the potential setups that might currently exist in legacy installs. My shop's old system of having the Access Key and Secret Key in django settings, and the Security Token in the env would work. And so would the strategy that I went with for my shop's new setup, which is to not specify the credentials in django settings at all, allowing boto3 to pull them from the environment itself.
I have also had 1.13 pinned for the past five months due to the same issue - it was a very difficult to diagnose as it only occurred in the AWS environment - this is a big problem on Lambda since it automatically sets these env variables. Thanks for the investigation @coredumperror - I should be able to unpin this now.
In general this library shipped sans security token originally so that would break backwards compat going back before I even had this library (10+ years ago).
I don't see how this breaks backwards compatibility, if anything the update from 1.13 -> 1.14 broke it.
CURRENT BEHAVIOUR:
AWS_SESSION_TOKEN env var | AWS_SESSION_TOKEN setting | KEY/ID setting | Result |
---|---|---|---|
SET | UNSET | SET | WONT USE AWS_SESSION_TOKEN |
UNSET | UNSET | SET | WONT USE AWS_SESSION_TOKEN |
SET | SET | SET | USES AWS_SESSION_TOKEN |
PROPOSED BEHAVIOUR:
AWS_SESSION_TOKEN env var | AWS_SESSION_TOKEN setting | KEY/ID setting | Result |
---|---|---|---|
SET | UNSET | SET | USES AWS_SESSION_TOKEN OR THROWS ERROR |
UNSET | UNSET | SET | WONT USE AWS_SESSION_TOKEN |
SET | SET | SET | USES AWS_SESSION_TOKEN |
It is worth noting that this is unlikely to break people's setup because if AWS_SESSION_TOKEN exists but is not used, the KEY/ID likely won't work by themselves because that indicates they are short-term credentials. Maybe there is some edge case where someone sets up their environment with both long term credentials AND AWS_SESSION_TOKEN set but that seems much less likely than the other case which is the default on AWS resources like lambda.
Apologies the fix is in #1336 but I just haven't released it yet.
Great! I look forward to the release.