appengine-gcs-client
Inconsistent handling of unicode for open / listbucket / delete
Hi,
The cloudstorage.listbucket(..) function gives you GCSFileStat objects, which decode UTF-8 encoded object names for you, so that GCSFileStat.filename is a unicode instance. This is nice.
But passing a unicode instance to the open or delete functions raises a KeyError if the string includes non-ASCII characters:
Traceback (most recent call last):
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/bottle.py", line 862, in _handle
    return route.call(**args)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/bottle.py", line 1732, in wrapper
    rv = callback(*a, **ka)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/wsgi.py", line 32, in create_utf8
    return create_file(u'Señor') #.encode('utf-8'))
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/wsgi.py", line 38, in create_file
    with cloudstorage.open(dest, 'w') as fh:
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/cloudstorage/cloudstorage_api.py", line 91, in open
    filename = api_utils._quote_filename(filename)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/cloudstorage/api_utils.py", line 94, in _quote_filename
    return urllib.quote(filename)
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib.py", line 1263, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf1'
It would be nice if the cloudstorage library automatically encoded unicode object names to UTF-8, as well as decoding them.
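Here is a minimal sketch of what I mean, assuming the encoding can happen in the quoting step shown in the traceback. The helper name quote_filename is hypothetical (it is not the library's api_utils._quote_filename), and the try/except import is only there so the snippet also runs outside the Python 2.7 runtime.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote  # Python 2, as on the App Engine python27 runtime
except ImportError:
    from urllib.parse import quote  # Python 3 fallback so the sketch runs anywhere


def quote_filename(filename):
    """Percent-encode an object name, encoding text to UTF-8 first.

    Hypothetical helper for illustration; not part of the cloudstorage library.
    """
    # Text (unicode) names are encoded to UTF-8 bytes before quoting, which
    # avoids the KeyError that urllib.quote raises for non-ASCII unicode input.
    if not isinstance(filename, bytes):
        filename = filename.encode('utf-8')
    return quote(filename)


quoted = quote_filename(u'/bucket/Se\xf1or')
```

With something like this, a unicode name and its UTF-8 encoded byte string quote to the same value, so the filename from a GCSFileStat could be passed straight back to open or delete.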
For example, in this test project, which creates objects with UTF-8 encoded names, each filename returned by listbucket has to be encoded to UTF-8 again before it can be passed to delete when deleting all objects in a bucket.
Thank you,
David B.