`Characters.operator ==` should document that it doesn't compare normalized forms
I expected that `Characters.operator ==` would compare normalized forms, but it doesn't. (See https://stackoverflow.com/q/64094438/.)
If it intentionally doesn't, it would be nice if the `operator ==` documentation explicitly stated that (and ideally recommended what people should do to normalize Unicode strings instead).
This package does exactly one thing: grapheme cluster segmentation in the default locale.
The documentation for `==` definitely needs fixing (what's it even saying?), but the fix will be to say that two `Characters` values are equal if their underlying strings are equal, which means containing the same sequence of UTF-16 code units.
(Or, what it tries to say now, that the `Characters` iterable values contain the same sequence of grapheme cluster substrings, which amounts to the same thing.)
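
For illustration, here is a minimal sketch of the behavior described above, assuming the `characters` package is available as a dependency. The helper `sameCodeUnits` is hypothetical and not part of the package; it only spells out what "same sequence of UTF-16 code units" means. The two sample strings are canonically equivalent spellings of "é" that differ in their code units.

```dart
import 'package:characters/characters.dart';

/// Hypothetical helper (not part of package:characters): returns true when
/// both strings consist of exactly the same UTF-16 code units.
bool sameCodeUnits(String a, String b) {
  if (a.length != b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a.codeUnitAt(i) != b.codeUnitAt(i)) return false;
  }
  return true;
}

void main() {
  const composed = '\u00E9'; // "é" as a single precomposed code point
  const decomposed = 'e\u0301'; // "e" followed by a combining acute accent

  // Each string is a single grapheme cluster.
  print(composed.characters.length); // 1
  print(decomposed.characters.length); // 1

  // Equality follows the underlying strings, so no normalization is applied.
  print(composed.characters == decomposed.characters); // false
  print(sameCodeUnits(composed, decomposed)); // false

  // Identical code units compare equal.
  print('e\u0301'.characters == decomposed.characters); // true
}
```

Code that needs canonically equivalent strings to compare equal would have to normalize both sides (for example to NFC) before comparing, using something other than this package.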