harfbuzz Update hb-ucd-table.hh for GB18030-2022: remove the decompositions of 9 characters

Remove the decomposition mappings for the following 9 characters, in accordance with the GB18030-2022 standard.

GB18030-2022

codepoint = f92c, _hb_ucd_dm1_p0_map[740] = 90ce
codepoint = f979, _hb_ucd_dm1_p0_map[147] = 51c9
codepoint = f995, _hb_ucd_dm1_p0_map[558] = 79ca
codepoint = f9e7, _hb_ucd_dm1_p0_map[684] = 88cf
codepoint = f9f1, _hb_ucd_dm1_p0_map[772] = 96a3
codepoint = fa0c, _hb_ucd_dm1_p0_map[128] = 5140
codepoint = fa0d, _hb_ucd_dm1_p0_map[210] = 55c0
codepoint = fa18, _hb_ucd_dm1_p0_map[544] = 793c
codepoint = fa20, _hb_ucd_dm1_p0_map[661] = 8612
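To make the affected mappings easier to reason about, here is a minimal, self-contained sketch that restates the nine entries above as a lookup table. The names `Decomposition`, `kGb18030Affected` and `affected_decomposition` are hypothetical and are not part of HarfBuzz; the values are copied from the list above, and the sketch assumes C++17.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// One singleton decomposition: compatibility ideograph -> unified ideograph.
struct Decomposition {
  std::uint32_t from;
  std::uint32_t to;
};

// The nine mappings listed in this PR (values copied from the list above).
constexpr std::array<Decomposition, 9> kGb18030Affected = {{
    {0xF92C, 0x90CE}, {0xF979, 0x51C9}, {0xF995, 0x79CA},
    {0xF9E7, 0x88CF}, {0xF9F1, 0x96A3}, {0xFA0C, 0x5140},
    {0xFA0D, 0x55C0}, {0xFA18, 0x793C}, {0xFA20, 0x8612},
}};

// Hypothetical helper: returns the decomposition target if `cp` is one of
// the nine affected code points, or std::nullopt otherwise.
std::optional<std::uint32_t> affected_decomposition(std::uint32_t cp) {
  for (const auto &d : kGb18030Affected) {
    if (d.from == cp) return d.to;
  }
  return std::nullopt;
}
```

For example, `affected_decomposition(0xF92C)` yields `0x90CE`, while a code point outside the list yields an empty optional.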

Jul 19 '23 08:07 kiraskyler

meson not found. May I ask whether this is a build environment issue?

Jul 19 '23 09:07 kiraskyler

There was an error during the build process. Can you help me? @tronical @nico @torarnv @yosh

Jul 20 '23 01:07 kiraskyler

We don't currently tailor the Unicode data. It comes straight from Unicode.

Why do you think we need to remove these 9 decompositions? Please elaborate.

cc @dscorbett @jfkthame

Jul 20 '23 16:07 behdad

GB18030-2022 is a mandatory requirement of the Chinese government: products used in designated fields must pass certification against it. GB18030-2022 has incompatibilities with Unicode. The recommendation is to follow Unicode; where incompatibilities arise, such as the problem with how the 9 characters above are displayed, they are treated as other kinds of issues, such as input method issues or text issues. Unicode recommends retaining these compatibility glyphs for compatibility with earlier displays. Although Unicode suggests following Unicode where the two conflict, these conversions have a significant impact on GB18030-2022. The Unicode data used by HarfBuzz comes from Unicode, so it inevitably conflicts with GB18030-2022. How to handle this specifically may need to be discussed, and there may be a more comprehensive plan.

[unicode 22274-disruptive-changes.pdf](https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf)
[unicode 01314-FAQ-GB18030.htm](http://unicode.org/L2/L2001/01314-FAQ-GB18030.htm)
[unicode 23003-gb18030-recommendations.pdf](https://www.unicode.org/L2/L2023/23003-gb18030-recommendations.pdf)

Jul 21 '23 01:07 kiraskyler

Should higher-level software be the one to handle the cases that are not compatible with Unicode? When it calls into HarfBuzz to display text, can the higher-level software indicate that certain code points should not be treated as compatibility equivalents?

Jul 21 '23 01:07 kiraskyler

Higher-level software can definitely override what HarfBuzz does. I'd like to hear from others, including @jfkthame
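As a rough illustration of what such an override could look like, the sketch below wraps a default decomposition callback and refuses to decompose the nine code points from this PR. It is a library-independent model, not HarfBuzz code: `Codepoint`, `DecomposeFn`, `kKeepComposed`, `tailored_decompose` and `toy_default_decompose` are all hypothetical names, and the toy fallback knows only two mappings. In HarfBuzz itself the corresponding hook would presumably be a custom decompose callback installed with `hb_unicode_funcs_set_decompose_func` on funcs created from the defaults and attached with `hb_buffer_set_unicode_funcs`; that wiring is an assumption and is not shown or tested here.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using Codepoint = std::uint32_t;

// Shape of a decomposition callback: returns true and fills *a and *b when
// `ab` decomposes; otherwise sets *a = ab, *b = 0 and returns false.
using DecomposeFn = bool (*)(Codepoint ab, Codepoint *a, Codepoint *b);

// The nine code points from this PR, kept sorted for binary search.
constexpr std::array<Codepoint, 9> kKeepComposed = {
    0xF92C, 0xF979, 0xF995, 0xF9E7, 0xF9F1,
    0xFA0C, 0xFA0D, 0xFA18, 0xFA20};

// Hypothetical override: report "no decomposition" for the listed code
// points and defer to `fallback` for everything else.
bool tailored_decompose(DecomposeFn fallback, Codepoint ab,
                        Codepoint *a, Codepoint *b) {
  if (std::binary_search(kKeepComposed.begin(), kKeepComposed.end(), ab)) {
    *a = ab;
    *b = 0;
    return false;
  }
  return fallback(ab, a, b);
}

// Toy stand-in for a library's default decomposition (demonstration only).
bool toy_default_decompose(Codepoint ab, Codepoint *a, Codepoint *b) {
  if (ab == 0xF92C) { *a = 0x90CE; *b = 0; return true; }       // singleton
  if (ab == 0x00E9) { *a = 0x0065; *b = 0x0301; return true; }  // e + acute
  *a = ab;
  *b = 0;
  return false;
}
```

With this wrapper, U+F92C stays as it is while U+00E9 still decomposes through the fallback, so the tailoring lives in the caller and the Unicode data itself is left untouched.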

Jul 21 '23 01:07 behdad

Is this still needed?

Jul 31 '23 21:07 behdad