babel
babel copied to clipboard
`KeyError` raised in `format_skeleton` when using fuzzy matching
Overview Description
When using format_skeleton, a KeyError is raised despite using fuzzy=True.
Steps to Reproduce
from datetime import datetime
from babel.dates import format_skeleton
dt = datetime(2012, 1, 1, 14, 30, 59)
format_skeleton("G", dt, locale="cs_CZ", fuzzy=True)
From the docs:
fuzzy – If the skeleton is not found, allow choosing a skeleton that’s close enough to it.
The way I read it is that as long as I pass fuzzy=True (which is the default) a skeleton should always be found and I should not need to worry about KeyErrors in that case.
Even the example in the docs makes it seem a KeyError is only thrown with fuzzy=False:
format_skeleton('yMMd', t, fuzzy=False, locale='fi') # yMMd is not in the Finnish locale, an error is thrown
If this is the expected behaviour, I think the docs should state that explicitly. If not, I would suggest adding allow_different_fields=True to the underlying match_skeleton() call.
Additional Information
Babel version: 2.15
@akx What do you think? Should we fix this?
Yeah, I think so. My original commit for fuzzy skeletons (fuzzy skeletons, yikes what a horrible mental image) https://github.com/python-babel/babel/commit/cd70395b0f138b7f60b0866f5b2b164f1a025786 refers to the ICU4J getBestSkeleton function as the reference implementation.
Apparently there's been at least one commit https://github.com/unicode-org/icu/commit/edaebfa64ee673ed580f464123d3afd4deffc693 that has improved that function since – we should maybe look at whether that improvement would in fact fix this?
Looks like the only thing that changed is more format specifier replacements which we could add as well:
if 'k' in skeleton and not any('k' in option for option in options):
skeleton = skeleton.replace('k', 'H')
if 'K' in skeleton and not any('K' in option for option in options):
skeleton = skeleton.replace('K', 'h')
if 'a' in skeleton and not any('a' in option for option in options):
skeleton = skeleton.replace('a', '')
if 'b' in skeleton and not any('b' in option for option in options):
skeleton = skeleton.replace('b', '')
This doesn't fix the KeyError issue though. I'm a bit unsure whether we should fix it or just make it clear in the docs that the function can in fact raise even when fuzzy=True. If we do want to fix this, I think we can implement the recommendation from the tr35 doc for missing fields: https://www.unicode.org/reports/tr35/tr35-72/tr35-dates.html#missing-skeleton-fields
For the time being, we could maybe just document that fuzzy=True may still raise, and fix that in a subsequent version using that recommendation?
Yeah, that sounds like the best option