aws-sdk-perl
S3 PutObject URI encodes key?
Does PutObject() URI-encode the key string? Files are being uploaded to S3 with keys containing percent-encoded characters:
my $output = $s3->PutObject(
    Bucket => $bucket,
    Key    => 'example picture.jpg',
    ACL    => 'public-read',
    Body   => $data,
);
example picture.jpg is actually uploaded to S3 as example%20picture.jpg, which means normal paths to it don't work because the URI becomes example%2520picture.jpg.
Not sure if this is a Paws issue or an AWS one :smile:
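To see why normal paths then break: if the already-escaped key is escaped a second time when a URL is built, the percent sign itself gets encoded. A minimal sketch using URI::Escape (the module Paws uses internally for this):

use URI::Escape qw(uri_escape_utf8);

# First pass (at upload time): the space becomes %20.
my $stored = uri_escape_utf8('example picture.jpg');
print "$stored\n";                    # example%20picture.jpg

# Second pass (a client escaping the stored key again): the '%' itself
# is escaped, so %20 turns into %2520.
print uri_escape_utf8($stored), "\n"; # example%2520picture.jpg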
Hi. This is definitely a Paws issue that needs looking into. The escaping should be happening inside Paws::Net::RestXmlCaller, line 67. If you add a space to the safe characters, q[^A-Za-z0-9\-\._~ /], we won't encode the spaces in the URLs.
I'll try to look into the S3 documentation to find which characters should and shouldn't be URI-encoded. Any pointers welcome :smile:
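For reference, the second argument to uri_escape_utf8 is a character class of characters to escape, so the leading ^ turns it into a whitelist of safe characters. A small sketch of the effect of adding the space:

use URI::Escape qw(uri_escape_utf8);

# Current set: the space falls outside the class after '^', so it is escaped.
print uri_escape_utf8('example picture.jpg', q[^A-Za-z0-9\-\._~/]), "\n";
# example%20picture.jpg

# With a space added to the safe set, the key passes through untouched.
print uri_escape_utf8('example picture.jpg', q[^A-Za-z0-9\-\._~ /]), "\n";
# example picture.jpg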
Got it. I found the AWS recommendations.
I think you could either encode anything that AWS says isn't a "Safe Character", so line 67 becomes something like this:
$vars->{ $att_name } = uri_escape_utf8($call->$att_name, q[^0-9a-zA-Z!_.*'()]);
Or maybe blacklist the "Characters That Might Require Special Handling" and "Characters to Avoid":
$vars->{ $att_name } = uri_escape_utf8($call->$att_name, q[&\$\@=:+,?\\{\^}%>\[\]`~<#|]);
(The code is untested.) Given the AWS recommendations, it doesn't appear to be "wrong" to encode whitespace: the first example will encode it and the second won't, but both seem like valid approaches to me. What do you think?
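Roughly how the two proposals differ on the key from this issue (a sketch, untested like the lines above; the sigils in the blacklist are backslash-escaped so they are treated literally):

use URI::Escape qw(uri_escape_utf8);

my $key = 'example picture.jpg';

# Whitelist: a space is not an AWS "Safe Character", so it gets encoded.
print uri_escape_utf8($key, q[^0-9a-zA-Z!_.*'()]), "\n";
# example%20picture.jpg

# Blacklist: a space is not on the avoid list, so it passes through.
print uri_escape_utf8($key, q[&\$\@=:+,?\\{\^}%>\[\]`~<#|]), "\n";
# example picture.jpg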
From the doc, it looks like nothing has to be escaped in the key (everything is a "may", "should", "might"...). I'm curious what others are doing with keys... I suspect that it's up to the code that consumes the key to handle "special cases": if your code is going to interpret the key as the name of a file, then some symbols will cause problems converting the key to a filesystem path. For Paws, I'm thinking that key names should go to S3 unfiltered, but I'd first check what boto and the Ruby SDK do with keys.
Hi, we just bumped into this issue. We upload S3 objects from our Perl app using Paws, and from a Java app using a Java library (I don't know which, but I will try to find out). The Java lib does not encode spaces, but Paws does, leading to mismatches.
I agree with you that the AWS docs are very vague about this, so my suggestion would be to minimize changes and just add the space to the list of safe characters. This should be safe, would solve our issue, and would also avoid potential double-encoding issues with browsers etc. The line in Paws::Net::RestXmlCaller would become:
$vars->{ $att_name } = uri_escape_utf8($call->$att_name, q[^A-Za-z0-9\-\._~/ ]);
What are your thoughts?
@sven-schubert: You may want to look at this issue, as you are currently working on S3
We just ran into this issue. We have two systems that are putting objects into S3: one written in Golang using the official Golang AWS SDK, and one written in Perl using Paws.
Our keys contain email addresses, with the @ intact. The Golang SDK is uploading them as-is, while Perl is encoding the @ into %40, breaking our downstream processes that are trying to find the messages.
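This matches the safe set quoted above: @ falls outside the class, so it is percent-encoded. A one-line illustration (the key here is a made-up example):

use URI::Escape qw(uri_escape_utf8);

# '@' is outside Paws's safe set, so it becomes %40.
print uri_escape_utf8('inbox/user@example.com/msg-001', q[^A-Za-z0-9\-\._~/]), "\n";
# inbox/user%40example.com/msg-001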
@pplu Is there a workaround or any way to address this?
@veqryn: a couple of workarounds are suggested in this issue (https://github.com/pplu/aws-sdk-perl/issues/111#issuecomment-300135620 and https://github.com/pplu/aws-sdk-perl/issues/111#issuecomment-250996388). I'd love a pull request with a fix, since I never arrived at a conclusion about what has to get encoded and what doesn't.
Those workarounds look like a modification of the Paws source, right?
Is there anything a client using this library can do as a workaround without modifying the library's source?
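One possible client-side approach (an untested sketch, assuming Paws::Net::RestXmlCaller calls an imported uri_escape_utf8, as quoted earlier in this thread) is to override that function in the module's namespace at runtime. Note that this affects every Paws request in the process:

use URI::Escape ();
require Paws::Net::RestXmlCaller;   # make sure the original import happens first

{
    # Replace the uri_escape_utf8 that RestXmlCaller imported with a wrapper
    # that uses a wider safe set (space and '@' left unencoded).
    no warnings 'redefine';
    *Paws::Net::RestXmlCaller::uri_escape_utf8 = sub {
        my ($str) = @_;   # deliberately ignore the module's own unsafe set
        return URI::Escape::uri_escape_utf8($str, q[^A-Za-z0-9\-\._~/ \@]);
    };
}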
This should be fixed together with the fix for #221 (PR in #265), which ensures we only encode once in the URI. Tests have been added to check this: t/s3/uri_encoding.t
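The shipped test exercises the full request pipeline; the property it guards can be sketched roughly like this (hypothetical, not the actual contents of t/s3/uri_encoding.t):

use Test::More;
use URI::Escape qw(uri_escape_utf8);

my $once = uri_escape_utf8('example picture.jpg', q[^A-Za-z0-9\-\._~/]);

is $once, 'example%20picture.jpg', 'key is percent-encoded exactly once';
unlike $once, qr/%25/, 'no double encoding (%20 never becomes %2520)';

done_testing;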