Robin Leroy

189 comments by Robin Leroy

Yes, I somehow got distracted from ICU4[CJ] matters last week and dropped this ball. I intend to get back to this on Monday; please poke me with a sharp stick...

Exciting Development: While testing the new monkeys, I came across a string which exposes a bug in my rules for LB19a. Somehow the old monkeys never came up with such...

@markusicu Status report: 70089cd68383daeb611017393708b54a907f17d0 is green (except for clang warnings which I am fixing in the next commit), so if this is blocking too many things you could run with...

With https://github.com/unicode-org/icu/pull/3028/commits/c96eb89656e806c008ada83cef7b380b84ad6608, the old monkeys would quickly have caught the issue in the rules at https://github.com/unicode-org/icu/pull/3028/commits/6ae4c111843daf671bb4777ed3d6f69374df43d4, e.g. ``` C:\Users\robin\Projects\Unicode\icu\icu4c\source\test\intltest\rbbitst.cpp:4559 Break expected but not found at index 165. Parameters to reproduce:...

Status report: I went back to my original approach of using look-ahead break rules. Thanks to a comment from @aheninger, > Look ahead rules with ambiguous preceding context can lead...

Aside from the fact that I still have edge cases to deal with, if we want to rely on the old monkeys, perhaps we should feed them more bits; it...

Alright, now using ranlux48 (I had initially gone with mt19937_64, but it turns out it is not great statistically, which I do not really care about, and it has a...
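For readers unfamiliar with the engine mentioned above, here is a minimal sketch of drawing reproducible random code points from `std::ranlux48`. The helper name `randomCodePoints` and its parameters are hypothetical and are not taken from the ICU test code, which is not shown in this excerpt; the sketch only illustrates that a seed fully determines the generated sequence, which is what makes a "parameters to reproduce" report possible.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not an ICU function): draws `count` values in
// [0, 0x10FFFF] from a ranlux48 engine seeded with `seed`.
// Note: the range includes surrogate code points; a real test would
// decide separately whether to keep or skip them.
std::vector<char32_t> randomCodePoints(std::uint64_t seed, std::size_t count) {
    std::ranlux48 engine(seed);
    std::uniform_int_distribution<std::uint32_t> pick(0, 0x10FFFF);
    std::vector<char32_t> result;
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back(static_cast<char32_t>(pick(engine)));
    }
    return result;
}
```

Calling `randomCodePoints(seed, n)` twice with the same seed in the same build yields the same sequence, so a failing input can be regenerated from the seed alone.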

> I was able to figure out the issue with the batch of rules I had added in 9782d0d60661b8970848dd6a9c271fe2651da8e4, and to fix the BA situation as well.

Correction: I was...

> could explain to me the missing ranges in ScriptExtensions.txt and Scripts.txt?

Scripts.txt has no missing ranges (if anything were missing, it would violate an invariant test that assigned characters...

> ScriptExtensions.txt only has entries for characters that have more than one script in its Script_Extensions, thus the scx={Xsux} character and the scx={Pcun} characters are gaps.

The scx situation is...
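As background for the gaps discussed here, the headers of the two data files state the defaults: a code point not listed in ScriptExtensions.txt takes its Script value as its Script_Extensions, and a code point not listed in Scripts.txt has Script=Unknown (Zzzz). A minimal sketch of that fallback follows; the function `scriptExtensions` and the two map parameters are hypothetical stand-ins for data parsed from those files, not an ICU API.

```cpp
#include <map>
#include <set>
#include <string>

using ScriptSet = std::set<std::string>;

// Hypothetical lookup (not an ICU function).
// `sc`  stands in for data parsed from Scripts.txt (code point -> script).
// `scx` stands in for data parsed from ScriptExtensions.txt
//       (code point -> explicitly listed set of scripts).
ScriptSet scriptExtensions(char32_t c,
                           const std::map<char32_t, std::string>& sc,
                           const std::map<char32_t, ScriptSet>& scx) {
    // An explicit Script_Extensions entry wins.
    auto explicitEntry = scx.find(c);
    if (explicitEntry != scx.end()) {
        return explicitEntry->second;
    }
    // Otherwise fall back to the Script value, or to Zzzz (Unknown)
    // when the code point is not listed in Scripts.txt either.
    auto script = sc.find(c);
    return {script != sc.end() ? script->second : std::string("Zzzz")};
}
```

Under this reading, a "gap" in ScriptExtensions.txt is not missing data: the lookup simply resolves to the single script recorded in Scripts.txt.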