
Use icu.UnicodeSet instead of custom code in tools/readers.py parse_unicode_set

Open · twardoch opened this issue 4 years ago · 1 comment

If you add PyICU to requirements, then you can rewrite parse_unicode_set from tools/readers.py trivially:

import icu


def parse_unicode_set(s):
    # icu.UnicodeSet parses the pattern; iterating it yields its members as strings
    return sorted(icu.UnicodeSet(s))

twardoch · May 24 '21 13:05

Super handy, thanks! A lot of the tools/ scripts were only for the initial data aggregation, but if we re-use some of that parsing I'll update it. The signatures of the two are a bit different and I think the paths have changed since, so... yeah :)

We're already using icu in the comparison code for CLDR/SLDR, though, so thanks for the pointer 👍

Leaving this open as a reminder.

kontur · Jun 01 '21 13:06