The autojson plugin can't correctly handle Chinese
Whatever I set response.charset or the charset in response.content_type to, Chinese characters are always converted to unicode escape sequences. So I can only turn off autojson and output the JSON string myself. json.dumps(<dict_to_convert>, ensure_ascii=True) behaves like autojson (unicode escapes); json.dumps(<dict_to_convert>, ensure_ascii=False) is what I want (a bytestring). Maybe bottle needs to provide a choice.
response.content_type = 'application/json; charset=' + response.charset
result_data = json.dumps({"code": "1", "message": "ok", "data": response_data},
                         encoding=response.charset, ensure_ascii=False)
return result_data
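To show the difference outside of bottle, here is a minimal sketch using only the stdlib json module (the dict is an arbitrary example, not taken from the app above):

```python
import json

data = {"hello": u"世界"}

# Default behavior (what autojson does): non-ASCII characters
# become \uXXXX escape sequences.
print(json.dumps(data))                      # {"hello": "\u4e16\u754c"}

# ensure_ascii=False keeps the characters readable; encode the
# result to get a UTF-8 bytestring for the response body.
print(json.dumps(data, ensure_ascii=False))  # {"hello": "世界"}
```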
Can you give some example code that shows what the problem is? What do you mean "converted to unicode characters"? Chinese characters only exist as unicode characters.
@app.route('/api/hello', method='GET')
def hello():
return {"hello": "世界"}
the output is
{"hello": "\u4e16\u754c"}
the Content-Type in response header:
Content-Type:application/json
@app.route('/api/hello', method='GET')
def hello():
response.content_type = 'application/json; charset=UTF-8'
return {"hello": "世界"}
the output is
{"hello": "\u4e16\u754c"}
the Content-Type in response header:
Content-Type:application/json
Python 2 or 3?
You realize that that output is perfectly acceptable behavior, right? Any reasonable json parser will parse {"hello": "世界"} and {"hello": "\u4e16\u754c"} to the same thing.
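The equivalence of the two representations is easy to verify with the stdlib parser:

```python
import json

# Both serializations decode to the identical dict.
a = json.loads('{"hello": "\\u4e16\\u754c"}')
b = json.loads(u'{"hello": "世界"}')
assert a == b == {u"hello": u"世界"}
```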
I think that defaulting to ensure_ascii=True is sane behavior, since it's more likely that tools choke on unicode input than it is that their json parser is non-compliant. Having said that, it's obviously awkward for you not to be able to read the chinese characters from a raw request.
What do you want the api to look like to tell bottle your choice?
As a hack, you can currently do:
app = bottle.default_app() # or your app object
app.plugins[0].json_dumps = lambda *args, **kwargs: \
    json.dumps(*args, ensure_ascii=False, **kwargs).encode('utf8')
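The same idea as a standalone function, which is easier to test in isolation (the function name is my own; the plugin-override above is the actual hack):

```python
import json

def json_dumps_utf8(*args, **kwargs):
    # Keep non-ASCII characters as-is and return a UTF-8
    # encoded bytestring, like the lambda override above.
    return json.dumps(*args, ensure_ascii=False, **kwargs).encode('utf8')

body = json_dumps_utf8({"hello": u"世界"})
# body is b'{"hello": "\xe4\xb8\x96\xe7\x95\x8c"}' on Python 3
```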
Python 2. I know the two forms mean the same thing. I prefer to output the content as a bytestring ('\xe4\xb8\x96\xe7\x95\x8c', i.e. "世界"); the unicode escapes ('\u4e16\u754c') in the API output are not readable, and that doesn't feel right to me. I temporarily solved this with a decorator that calls json.dumps(xxx, ensure_ascii=False). Your solution is also very good, thanks. ^.^
By that you presumably meant to say "I'd prefer to output UTF-8 encoded characters rather than \u-escaped characters." In both cases the output is a bytestring.
JSON is defined as UTF-8 encoded text, so it should be fine to include non-ascii characters in JSON strings and there is no need for these unicode escape sequences. It would certainly save some bytes and be easier to read for humans, and human readable API responses are a good thing in my opinion.
If both representations are equal and the human readable one actually saves some bytes and involves no additional overhead, I'd say we should switch the default to the human readable representation.
I'd like to introduce two new config parameters:
- autojson.ascii (default: false): If true, convert all non-ascii characters to escape sequences.
- autojson.compact (default: false): If true, return compact (but less readable) JSON with less whitespace.
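In terms of the stdlib json module, the two proposed switches would roughly map onto json.dumps arguments like this (a sketch of the intended semantics, not a released bottle API):

```python
import json

data = {"hello": u"世界", "n": 1}

# autojson.ascii would control ensure_ascii:
escaped  = json.dumps(data, ensure_ascii=True,  sort_keys=True)
readable = json.dumps(data, ensure_ascii=False, sort_keys=True)

# autojson.compact would control the separators (no space after
# ',' and ':'), trading readability for fewer bytes:
compact = json.dumps(data, separators=(',', ':'),
                     ensure_ascii=False, sort_keys=True)
```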
And a new default behavior: The autojson plugin should return human readable and developer friendly JSON by default (indented, with unicode characters in strings).
This might break (broken) tests that test against hard-coded JSON strings, but should not break any real world applications or integrations. We can still discuss if the default values for the configuration should reflect the current behavior, and warn about the changing defaults with depr(0,13) so we can change it in 0.14.
Thanks for the reply, @defnull. I'm glad my idea can be adopted. Bottle is quite good to use.
I just committed a long overdue ConfigDict patch and a more sophisticated autojson config to master. This feature request should now be quite straightforward to implement. See Bottle.init() and JsonPlugin.