tornado.locale.Locale.get_closest() not getting closest match
The tornado.locale.Locale.get_closest() function (and, by extension, tornado.locale.get) is a bit broken. It tries to match two-character language codes directly against the frozenset in which the five-character locale codes are stored, which fails when only a regional variant such as de_DE is supported. As a result, a two-character code only matches if that exact code is supported; otherwise the default locale is returned.
    @classmethod
    def get_closest(cls, *locale_codes):
        """Returns the closest match for the given locale code."""
        for code in locale_codes:
            if not code:
                continue
            code = code.replace("-", "_")
            parts = code.split("_")
            if len(parts) > 2:
                continue
            elif len(parts) == 2:
                code = parts[0].lower() + "_" + parts[1].upper()
            if code in _supported_locales:
                return cls.get(code)
            if parts[0].lower() in _supported_locales:
                return cls.get(parts[0].lower())
        return cls.get(_default_locale)
Specifically this part:
    if parts[0].lower() in _supported_locales:
        return cls.get(parts[0].lower())
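One possible fix, sketched as a standalone function rather than a patch to Tornado: if neither the exact code nor the bare language is supported, fall back to any supported locale that shares the language part. The names `closest_locale`, `supported` and `default` are hypothetical stand-ins for `get_closest`, `_supported_locales` and `_default_locale`, and picking the alphabetically first regional variant is an assumption, not something Tornado does.

```python
def closest_locale(codes, supported, default="en_US"):
    """Return the best supported locale code for the given candidates.

    Hypothetical standalone sketch; works on plain strings, not Locale objects.
    """
    for code in codes:
        if not code:
            continue
        parts = code.replace("-", "_").split("_")
        if len(parts) > 2:
            continue
        lang = parts[0].lower()
        # Normalise to "ll" or "ll_CC" (bare codes are lowercased here too).
        code = lang + "_" + parts[1].upper() if len(parts) == 2 else lang
        # 1. Exact match, e.g. "de_DE" against "de_DE".
        if code in supported:
            return code
        # 2. Bare language, e.g. "de_DE" against "de".
        if lang in supported:
            return lang
        # 3. Any regional variant of the same language, e.g. "de" against
        #    "de_DE". Sorted so the choice is deterministic (an assumption).
        for candidate in sorted(supported):
            if candidate.split("_")[0].lower() == lang:
                return candidate
    return default
```

With this, `closest_locale(["de"], {"de_DE", "en_US"})` picks `de_DE` instead of falling through to the default.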
I wrote the following simple snippet in my own code to work around this problem, but I bet someone else can write it a bit more nicely into the intended function:
    locale = self.request.headers.get('Accept-Language')
    if locale:
        for l in tornado.locale.get_supported_locales():
            if locale == l.split("_")[0]:
                self.locale = tornado.locale.get(l)
Bump - not sure why this has not been fixed.
get_browser_locale() does not return the correct closest match. If I have de_DE in my locale files and Firefox sends de,en-US;q=0.7,en;q=0.3 in the Accept-Language header, it should use the de_DE locale as the closest match, but it doesn't. It defaults to English.
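For reference, here is a minimal sketch of how a header like that can be split into an ordered list of language tags. `parse_accept_language` is a hypothetical helper written for this comment, not Tornado's implementation; it assumes a missing `q` means 1.0 and that `q=0` means "not acceptable".

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    scored = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0  # assumed default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable weight as unacceptable
        if quality > 0:
            scored.append((tag, quality))
    # list.sort is stable, so tags with equal weight keep their header order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in scored]


tags = parse_accept_language("de,en-US;q=0.7,en;q=0.3")
```

For the Firefox header above, `tags` starts with `de`, so a matcher that can map `de` to `de_DE` would never need to look at the English entries.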
Currently Tornado does this:
| header | supported | match |
|---|---|---|
| de_DE | de_DE | True |
| de_DE | de | True |
| de | de | True |
| de | de_DE | False |
The last value should be True.
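The table can be reproduced without Tornado by copying the matching steps of get_closest into a small function. `current_match` and `rows` are hypothetical names used only for this illustration; the function returns whether a supported locale is found instead of returning a Locale object.

```python
def current_match(header_code, supported):
    """Mirror the matching steps of get_closest for a single code."""
    code = header_code.replace("-", "_")
    parts = code.split("_")
    if len(parts) > 2:
        return False
    elif len(parts) == 2:
        code = parts[0].lower() + "_" + parts[1].upper()
    if code in supported:
        return True
    # The bare language is only tested for membership, so "de" can never
    # find "de_DE" in the supported set.
    return parts[0].lower() in supported


# (header, supported) pairs in the same order as the table rows.
rows = [("de_DE", "de_DE"), ("de_DE", "de"), ("de", "de"), ("de", "de_DE")]
results = [current_match(header, {supported}) for header, supported in rows]
```

`results` comes out as `[True, True, True, False]`, matching the table row by row.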
It seems header locale format vs unix locale format is being mixed.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language#Directives
Examples:
Accept-Language: de
Accept-Language: de-CH
Accept-Language: en-US,en;q=0.5
On the other hand, get_browser_locale in the RequestHandler class ends up calling Locale.get_closest(), where the dash is replaced by an underscore and a system locale is returned.
    code = code.replace("-", "_")
    parts = code.split("_")
I would expect get_browser_locale to return the "Accept-Language" header and that's it. Instead, it returns a system locale code. Another method could then set the locale based on the header, doing the logic to match the best available locale.
> It seems header locale format vs unix locale format is being mixed.
Yeah, this is some old code that makes some questionable design decisions. It's not very principled in how it handles different locale formats.
> I would expect get_browser_locale to return the "Accept-Language" header and that's it.
You can already do that: self.request.headers.get('Accept-Language'). The whole point of get_browser_locale is to find the most preferred Locale object from tornado.locale that matches what's in the header. If you're using some other localization system instead of tornado.locale (which is perfectly fine!), just ignore this method and go straight to the request headers yourself.