Special characters in filename cause uploads to fail

Open jet10000 opened this issue 2 years ago • 13 comments

Bug report

Uploading a file named "望舌诊病.pdf" fails.

Describe the bug

image

jet10000 avatar Apr 08 '22 03:04 jet10000

Thanks for the bug report @jet10000! We'll take a look.

thebengeu avatar Apr 08 '22 06:04 thebengeu

@thebengeu @jet10000 @alaister As of now, for both objectName and bucketName, Supabase only allows S3-safe characters, per the AWS guidelines linked in the code comment below:

 // only allow s3 safe characters and characters which require special handling for now
 // https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html

rahul3v avatar Apr 12 '22 11:04 rahul3v

S3 supports UTF-8 characters in filenames. However, at the moment, we are very strict with which filenames we allow. I think this is a valid use case to add support for different languages in filenames.

One option is to update the isValidKey function https://github.com/supabase/storage-api/blob/9480891af024396c58045578d16c91778aae67d2/src/utils/index.ts#L76-L80 to allow everything aside from the characters outlined in https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html#object-key-guidelines-avoid-characters
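A minimal sketch of what such a deny-list check could look like. This is not the actual isValidKey implementation: the name `isValidKeyPermissive` is hypothetical, and the character list is an assumption based on the "characters to avoid" section of the AWS page linked above, plus ASCII control characters. Check it against that page before relying on it.

```typescript
// Hypothetical deny-list variant of isValidKey (illustrative name, not in storage-api).
// Assumption: rejects the characters AWS recommends avoiding in object keys
// (\ { } ^ % ` [ ] " < > ~ # |) and ASCII control characters; everything else,
// including non-Latin scripts, accented letters, and spaces, is accepted.
const AVOID_CHARS = /[\\{}^%`\[\]"<>~#|\u0000-\u001f\u007f]/;

function isValidKeyPermissive(key: string): boolean {
  // An empty key is never valid; otherwise the key must contain no avoided character.
  return key.length > 0 && !AVOID_CHARS.test(key);
}
```

With this check, "望舌诊病.pdf" and "åäö.png" would be accepted, while a key such as "100%.txt" would still be rejected.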

alaister avatar Apr 14 '22 08:04 alaister

I would love support for å,ä,ö.

Love from Sweden

PiddePannkauga avatar Aug 31 '22 19:08 PiddePannkauga

Hi, OpenDAL ran into a similar problem in https://github.com/apache/incubator-opendal/pull/2190: we have a test case (which passed on most storage platforms, from s3, gcs, and azblob to hdfs) like the following:

let path = format!("{} !@#$%^&()_+-=;',.txt", uuid::Uuid::new_v4());

Does this case make sense to you? I'm willing to help fix this. I believe that all URL-unsafe characters should be percent-encoded, and the server-side should handle the job of decoding them.
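As a rough illustration of that idea, the client could percent-encode each path segment and the server could decode it before validating or storing the key. The helper names `encodeObjectPath` and `decodeObjectPath` are hypothetical and not part of any Supabase or OpenDAL API; the sketch only relies on the standard `encodeURIComponent` and `decodeURIComponent` functions.

```typescript
// Hypothetical client-side helper: percent-encode every path segment
// but keep "/" as the separator between segments.
function encodeObjectPath(path: string): string {
  return path
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

// Hypothetical server-side counterpart: decode each segment back to the
// original text. decodeURIComponent throws a URIError on malformed input,
// which a real server would need to catch and turn into a 400 response.
function decodeObjectPath(encoded: string): string {
  return encoded
    .split("/")
    .map((segment) => decodeURIComponent(segment))
    .join("/");
}

// "folder/a b#c.txt" is sent as "folder/a%20b%23c.txt" and decoded back unchanged.
const sent = encodeObjectPath("folder/a b#c.txt");
const received = decodeObjectPath(sent);
```

Note that this only makes the name safe in the URL; the decoded name would still have to pass whatever key validation the server applies.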

Xuanwo avatar May 02 '23 16:05 Xuanwo

I would love support for ´. I can't use words with accents in my image names, and accents are very common in Spanish.

javierfern03 avatar Jan 16 '24 19:01 javierfern03

Also ÖÄÜ in German

turulix avatar Jan 29 '24 23:01 turulix

I can understand why the isValidKey function looks the way it does: accepting only the Latin alphabet is the safe choice. But as a Korean, I have no good way to safely convert Korean characters (Hangul) to the alphabet. The same goes for Japanese and Chinese characters. Is there a specific reason for the function's regex? It would be great if the function accepted characters encoded with the encodeURIComponent function.

orlein avatar Feb 01 '24 03:02 orlein

macOS generates screenshot names that don't match the pattern in isValidKey. For example, "Screenshot 2024-01-24 at 12.25.39 AM".

li4man0v avatar Feb 21 '24 07:02 li4man0v

Oops! This issue seems easy to fix, but it has been open for 2 years. Unbelievable!

jingsam avatar Apr 20 '24 15:04 jingsam

Hi everyone,

My solution is to base64-encode the file name when uploading. Demo: https://github.com/ThaddeusJiang/supabase-helpers/blob/main/backup_storage_buckets.ts#L71-L73
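For readers who cannot open the link, the general idea can be sketched as below. This is not the code from the linked demo: the helper names are hypothetical, the URL-safe base64 alphabet is my own choice so that the result contains only letters, digits, "-" and "_", and the sketch assumes a runtime that provides `TextEncoder`, `TextDecoder`, `btoa`, and `atob` (browsers, Deno, or Node 16+).

```typescript
// Hypothetical helper: encode a file name as URL-safe base64 of its UTF-8 bytes.
function encodeObjectName(name: string): string {
  const bytes = new TextEncoder().encode(name);
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b); // one char per byte, as btoa expects
  });
  // Swap "+" and "/" for "-" and "_" and drop the "=" padding.
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Hypothetical inverse: recover the original file name from the encoded key.
function decodeObjectName(encoded: string): string {
  let b64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
  while (b64.length % 4 !== 0) b64 += "="; // restore the padding atob requires
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new TextDecoder().decode(bytes);
}

const key = encodeObjectName("望舌诊病.pdf");
const original = decodeObjectName(key);
```

The trade-off is that stored names become unreadable and roughly a third longer, so the original name has to be decoded (or stored separately) wherever it is shown to users.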

ThaddeusJiang avatar Apr 22 '24 02:04 ThaddeusJiang