Markus Scherer comments

Results: 324 comments by Markus Scherer

Upgrade UnicodeProperty foundation

Mark started a related doc: “[Modernizing UnicodeTools](https://docs.google.com/document/d/1Cp8vdUpcXTzOGDbeSsgl2f9vcLrP3OJaPsgQbfHfQAg/edit)”

Upgrade UnicodeProperty foundation

@macchiati would you consider interface UnicodeProperty deprecated with the v3 IndexUnicodeProperties? Your description sounds like that might just be for ease of transition of older code?

Workflow: Publish Data

Looks promising! Some thoughts:

> Include the final release mode?

That could be messy, because the /Public/draft/ folder structure (all files for a version under one root) differs from that...

Workflow: Publish Data

Please un-delete the old scripts for now, so that we can try out the new workflow while we still have the old scripts for backup.

ICU-23038 Unicode 17 beta

@eggrobin ICU4C properties data is in, so you should be able to start work on C++ RBBI. On my machine, I currently see RBBI and collation test failures: ``` |...

ICU-23038 Unicode 17 beta

@eggrobin I went through the rest of the update instructions. C++ tests still fail with rbbi, which is probably expected. (Different set of failures from before.) I haven't run Java...

> I am assuming that these Java test have nothing to do with RBBI

I assume the same. @richgillam could you please take a look at

> Error: PersonNameConsistencyTest.TestPersonNames:107->AbstractTestLog.errln:50 Failure...

ICU-23038 Unicode 17 beta

> > @richgillam could you please take a look at
> >
> > > Error: PersonNameConsistencyTest.TestPersonNames:107->AbstractTestLog.errln:50 Failure in km.txt: Found 20 errors.
> >
> > ?
>
> @markusicu @eggrobin...

ICU-23038 Unicode 17 beta

@richgillam Locally I turned on VERBOSE_OUTPUT in PersonNameConsistencyTest and got these lines for Khmer: ``` Expected 'ស្តូបើ, ហ្. ហេ. មី.', got 'ស្តូបើ, ហ្សា. ហេ. មី.' at line 606 Expected 'ស្តូបើ...

ICU-23038 Unicode 17 beta

It looks like the code is abbreviating the given name differently, and sometimes the surname.

- given=ហ្សាហ្សីលៀ
- expected=ហ្. (HA COENG)
- got=ហ្សា. (HA COENG SA AA)

Khmer characters are...
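For readers who cannot easily tell the two abbreviations apart on screen, the following is a minimal sketch that lists the code points behind the "expected" and "got" strings quoted above. It uses only Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and the script only inspects the strings; it does not reproduce how PersonNameConsistencyTest or ICU derives the abbreviation.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# The two abbreviations from the comment, written as escapes and without the
# trailing period, so the code points are unambiguous.
expected = "\u17a0\u17d2"             # HA COENG
got = "\u17a0\u17d2\u179f\u17b6"      # HA COENG SA AA

for label, value in (("expected", expected), ("got", got)):
    print(label)
    for code_point, name in describe(value):
        print(f"  {code_point} {name}")
```

The "got" string starts with the same two code points as the "expected" string and then continues with the subscript consonant and the vowel sign, which matches the HA COENG versus HA COENG SA AA annotation in the comment.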