Fable
Add missing Unicode categories in Python library
This adds the missing Unicode categories to fix the error ValueError: Fable error, unknown Unicode category: Ps, which is raised when calling, for example, Char.IsLetterOrDigit with a left parenthesis '('.
I used the values defined in the referenced doc, https://docs.microsoft.com/en-us/dotnet/api/system.globalization.unicodecategory?view=net-6.0. While doing so, I also found that No was assigned to UnicodeCategory.OtherLetter instead of UnicodeCategory.OtherNumber.
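For reviewers who want the mapping at a glance, here is a minimal standalone sketch of the lookup this change concerns. It is not the code in Fable's Python library: the names UnicodeCategory (as a Python IntEnum), CATEGORY_BY_ABBREVIATION, get_unicode_category and is_letter_or_digit are hypothetical and chosen only for illustration. The numeric values follow the .NET UnicodeCategory enum from the doc linked above, and the two-letter keys are the abbreviations returned by Python's unicodedata.category.

```python
import unicodedata
from enum import IntEnum


class UnicodeCategory(IntEnum):
    # Values mirror the .NET System.Globalization.UnicodeCategory enum.
    UppercaseLetter = 0
    LowercaseLetter = 1
    TitlecaseLetter = 2
    ModifierLetter = 3
    OtherLetter = 4
    NonSpacingMark = 5
    SpacingCombiningMark = 6
    EnclosingMark = 7
    DecimalDigitNumber = 8
    LetterNumber = 9
    OtherNumber = 10
    SpaceSeparator = 11
    LineSeparator = 12
    ParagraphSeparator = 13
    Control = 14
    Format = 15
    Surrogate = 16
    PrivateUse = 17
    ConnectorPunctuation = 18
    DashPunctuation = 19
    OpenPunctuation = 20
    ClosePunctuation = 21
    InitialQuotePunctuation = 22
    FinalQuotePunctuation = 23
    OtherPunctuation = 24
    MathSymbol = 25
    CurrencySymbol = 26
    ModifierSymbol = 27
    OtherSymbol = 28
    OtherNotAssigned = 29


# Abbreviations in the same order as the enum members above.
# Note that "No" pairs with OtherNumber and "Lo" with OtherLetter.
_ABBREVIATIONS = (
    "Lu Ll Lt Lm Lo Mn Mc Me Nd Nl No Zs Zl Zp Cc Cf Cs Co "
    "Pc Pd Ps Pe Pi Pf Po Sm Sc Sk So Cn"
).split()

CATEGORY_BY_ABBREVIATION = dict(zip(_ABBREVIATIONS, UnicodeCategory))


def get_unicode_category(ch: str) -> UnicodeCategory:
    """Hypothetical helper: map one character to its .NET-style category."""
    abbreviation = unicodedata.category(ch)
    try:
        return CATEGORY_BY_ABBREVIATION[abbreviation]
    except KeyError:
        raise ValueError(f"unknown Unicode category: {abbreviation}") from None


def is_letter_or_digit(ch: str) -> bool:
    """Hypothetical analogue of Char.IsLetterOrDigit: letters or decimal digits."""
    category = get_unicode_category(ch)
    return (
        category <= UnicodeCategory.OtherLetter
        or category == UnicodeCategory.DecimalDigitNumber
    )
```

With a complete table, is_letter_or_digit("(") returns False instead of raising, because Ps resolves to OpenPunctuation.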
I'm not sure how to test the surrogate category: I hit an error that I didn't get a chance to look into, and I left a note about it.
The test data was created by running the following script:
import sys
import unicodedata
from collections import defaultdict

# Group every code point by its two-letter Unicode general category.
unicode_category = defaultdict(list)
for c in map(chr, range(sys.maxunicode + 1)):
    unicode_category[unicodedata.category(c)].append(c)

# Print the first character of each category, escaped, next to the category.
for value in unicode_category.values():
    c = value[0]
    e = c.encode("unicode_escape")
    print(repr(e), unicodedata.category(c))
The script groups characters by category and then prints each category with a sample value (the first character found in that category).
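If it helps when turning that output into test cases, the sketch below is a variation on the same idea, not the script used for this PR. It keeps only the first character per category instead of building full lists, and checks that each escaped sample decodes back to the original character. The names sample_per_category and samples are hypothetical.

```python
import sys
import unicodedata


def sample_per_category() -> dict:
    """Return the first character found for each Unicode general category."""
    samples = {}
    for code_point in range(sys.maxunicode + 1):
        ch = chr(code_point)
        # setdefault keeps the first character seen for each category.
        samples.setdefault(unicodedata.category(ch), ch)
    return samples


samples = sample_per_category()
for category, ch in samples.items():
    escaped = ch.encode("unicode_escape")
    # The escaped form should round-trip back to the same character.
    assert escaped.decode("unicode_escape") == ch
    print(category, escaped)
```

Storing the samples in a dict makes it straightforward to look up a single category, for example samples["Ps"], when writing an individual test.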