Robin Leroy comments

Results: 189 comments of Robin Leroy

揚琴 slow 2..4

> It looks like you need to either add these piecemeal to the `$NonOtherLetterIdeographs`, or else define that set as `[\p{Ideographic} - \p{gc=Lo}]` rather than testing that it's equal to...
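The set difference suggested in the quote above can be illustrated with a minimal sketch. It does not use the unicodetools invariant syntax or ICU's `UnicodeSet`; as a stand-in it assumes `java.util.regex`, where `\p{IsIdeographic}` is the binary Ideographic property and `\p{Lo}` is General_Category=Other_Letter. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class NonOtherLetterIdeographs {
    // Stand-in for [\p{Ideographic} - \p{gc=Lo}]: a character-class
    // intersection of Ideographic with the complement of gc=Lo.
    private static final Pattern SET =
        Pattern.compile("[\\p{IsIdeographic}&&[^\\p{Lo}]]");

    // Returns whether a single code point belongs to the set.
    static boolean contains(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return SET.matcher(s).matches();
    }

    public static void main(String[] args) {
        // U+3007 IDEOGRAPHIC NUMBER ZERO is Ideographic with gc=Nl.
        System.out.println(contains(0x3007));
        // U+4E00 is Ideographic with gc=Lo.
        System.out.println(contains(0x4E00));
    }
}
```

Defining the set by properties this way means newly encoded characters are picked up from the property data rather than added one at a time.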

14 letters with palatal hook

> Please add the DoNotEmit data too.

Done. If we are going to get many more of these kinds of proposals, we need to find a way to incorporate validation...

adjust cldr/*BreakTest generation for Unicode 15.1

> More generally, are the diffs in lines related to QU intentional?

No. Blame the author of #456. It looks like SegmenterDefault is correct, and SegmenterCldr is wrong. This should...

adjust cldr/*BreakTest generation for Unicode 15.1

> will UTC and CLDR line break rules be the same starting with Unicode 16?

They probably will be. But even if they were not,

> Do we need a...

adjust cldr/*BreakTest generation for Unicode 15.1

> We do have that test file in ICU:
> https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata/LineBreakTest.txt

That is the file from the UCD, not a CLDR version. This one, generated by yours truly in the...

adjust cldr/*BreakTest generation for Unicode 15.1

This seems reasonable in principle; but I will note that we do not publish the unicodetools version of the segmenter rules as part of the UCD, so a no-op there...

Modifier capital S

Re the invariant test failure, `$caseOverlap` is a manually maintained set of 267 modifier letters and counting; it should probably be expressed in terms of properties instead.

Retire ToolUnicodeProperties & UCD.java

See also https://github.com/unicode-org/unicodetools/issues/484.

UTC-176-C35 Six compound tone diacritics

> Hmmm. The description does include a link to a UTC decision, namely https://www.unicode.org/L2/L2023/23157.htm#176-C35, but the Pipeline / UTC decision check fails.

Yeah, this regex is too strict; it expects...

UTC-176-C35 Six compound tone diacritics

Yes, I think this is a fairly clear-cut case of « diacritics are diacritics ». (For, in contrast, a case where the Diacritic property is not clear-cut, see the very...