hyperglot
Use icu.UnicodeSet instead of custom code in tools/readers.py parse_unicode_set
If you add PyICU to requirements, then you can rewrite parse_unicode_set from tools/readers.py trivially:
import icu

def parse_unicode_set(s):
    # Iterating a UnicodeSet yields its members as strings
    return sorted(icu.UnicodeSet(s))
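For illustration, here is a minimal sketch of the same idea that keeps PyICU optional, assuming it may not yet be listed in the requirements. The import guard, the `RuntimeError`, and the sample pattern are assumptions made for this example and are not taken from tools/readers.py.

```python
try:
    import icu  # PyICU; assumed to be installed separately
except ImportError:
    icu = None  # lets the module import even when PyICU is missing


def parse_unicode_set(pattern):
    """Expand a UnicodeSet pattern such as "[a-c]" into a sorted list of strings."""
    if icu is None:
        # Hypothetical error handling, chosen for this sketch only
        raise RuntimeError("PyICU is required to parse UnicodeSet patterns")
    # Iterating a UnicodeSet yields each member as a Python string
    return sorted(icu.UnicodeSet(pattern))


if icu is not None:
    # Hypothetical sample pattern: a range plus one extra character
    print(parse_unicode_set("[a-c é]"))
```

The guard only matters while PyICU is an optional dependency; once it is in the requirements, the plain two-line version is enough.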
Super handy, thanks! A lot of the tools/ scripts were for initial data aggregation, but should we re-use some of that parsing, I'll update it. The signatures of the two are a bit different and I think paths have changed since, so... yeah :)
But I'm already using icu in the comparison code for CLDR/SLDR, so thanks for the pointer 👍
Leaving this open as a reminder.