
My Consolidated Issues with Source Han Sans for January 2023

Open Marcus98T opened this issue 1 year ago • 5 comments

I apologize for the length of this post, but I ran into several issues while preparing my own fork. My examples may not be exhaustive because I only checked JIS Level 1 Kanji and part of JIS Level 2 Kanji.

Remapping of 帰 (U+5E30)

For 帰 (U+5E30), I suggest that the JP form (uni5E30-JP) be mapped to all regions. While I would like to remove the other glyph (actually uni5E30uE0101-JP), which is currently mapped to CN/TW/HK, I found that it cannot be removed because it is a required part of Adobe-Japan1-6.
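For anyone scripting against these glyph names, here is a minimal sketch that splits a name into its base code point, optional variation selector, and region tag. The naming pattern is an assumption inferred only from the two names quoted above (`uni5E30-JP` and `uni5E30uE0101-JP`); the function name `parse_glyph_name` is made up for illustration, and other glyphs in the font may follow different conventions.

```python
import re

# Assumed pattern, inferred from the two names quoted in this issue:
# "uni" + hex code point, optional "u" + hex variation selector, "-" + region tag.
GLYPH_NAME = re.compile(r"^uni([0-9A-F]{4,6})(?:u([0-9A-F]{4,6}))?-([A-Z]{2})$")


def parse_glyph_name(name):
    """Split a region-tagged glyph name into its parts (hypothetical helper)."""
    match = GLYPH_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized glyph name: {name!r}")
    base_hex, selector_hex, region = match.groups()
    return {
        "char": chr(int(base_hex, 16)),
        "base": int(base_hex, 16),
        # None when the name carries no variation selector.
        "selector": int(selector_hex, 16) if selector_hex else None,
        "region": region,
    }


plain = parse_glyph_name("uni5E30-JP")
variant = parse_glyph_name("uni5E30uE0101-JP")
print(plain["char"], plain["region"], plain["selector"])
print(variant["char"], variant["region"], hex(variant["selector"]))
```

Both names resolve to the same base character; only the second carries a variation selector, which is why it is a separate glyph.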

Screenshot 2023-01-10 at 19 09 15

Here is a comparison of Mac typefaces with this character: Screenshot 2023-01-10 at 19 53 57 Screenshot 2023-01-10 at 19 54 13

You can see that the first stroke in almost all SC fonts is a straight line, not a curve. The exception is the TC fonts, in which the first stroke curves inward, but there is no such glyph in Source Han Sans/Serif because they're off-scope for TW and HK use, so I suggest the TW/HK forms also follow the JP form.

While the current CN form is supported by the Unicode charts (as of 14.0, not sure about 15.0 and later), I think that for consistency's sake we should just use the JP form. Screenshot 2023-01-10 at 19 14 01

Unnecessary and similar glyphs

Further adding on to what’s reported in #313 and #326:

The usual components to look out for, as mentioned in these issues, are 羊, 豆, 登, 鼓 (the left part), 辛, 幸 and also 戶 and 子. JP is circled in blue and the CN is circled in red. ~I suggest that we adopt CN forms for all locales for better consistency. I am not going to go into detail as there are too many characters to list and this was reported many times before.~ EDIT: I retract this statement.

Screenshot 2023-01-10 at 23 10 31

I have new components to share, however.

Components such as 五 and 吾 might need to be unified to the JP form, where the third stroke is angled at 90 degrees. Examples: 五, 伍, 吾, 悟, 梧, 語, 语 (because it should be consistent with the rest even if it's a simplified character) Screenshot 2023-01-10 at 19 18 12

Interestingly, other than 语, the HK locale is all consistent here. The JP locale has some CN forms in it. Please standardize here.

For components with 矛, I’m not sure which form to prefer. The second stroke is unnecessarily different, and I think the JP form should be preferred (circled in blue). Please consider trying to redesign the affected characters so that part can work with all regions. Examples: 矞, 矜, 矟, 矡, 務, 霧 Screenshot 2023-01-10 at 19 28 42

For components with 父, I think the disconnected JP glyph should be chosen for all regions. Examples: 釜, 斧, 爸. Screenshot 2023-01-10 at 19 37 56

Miscellaneous similar characters

While I am not sure, 兇 (U+5147) might need to be unified to the JP form, or have the feet removed in the CN form if it’s not unifiable due to the X component that must be above the 凵 component. Screenshot 2023-01-10 at 20 05 40

The CN and TW glyphs for 画 (U+753B, CN and TW only) are too similar, and the TW glyph has unnecessary feet. I suggest remapping the TW/HK locales to the CN glyph and removing the TW glyph. Screenshot 2023-01-10 at 23 15 18

The glyphs for 弱 (U+5F31, CN and HK only) look unnecessarily similar. I suggest using the HK form for the CN/TW locales and removing the CN glyph. Screenshot 2023-01-10 at 23 17 55

The following characters are too similar, so I suggest using JP form for all regions and removing the CN glyphs: 喬 (U+55AC, CN and JP), 衡 (U+8861, CN and JP), 尸 (U+5C38, CN and JP), 燁 (U+71C1, CN and JP; actually I'm not sure about this one though as the bottom strokes might be necessary to differentiate) Screenshot 2023-01-10 at 23 23 29

The two characters, 原 (U+539F) and 源 (U+6E90), should be unified to the JP forms and the CN glyphs be removed. Screenshot 2023-01-10 at 23 42 53

In addition, the other relevant glyphs between CN and JP are mostly following the JP forms (other than the radicals needed to differentiate between those two regions). I suggest unifying 原 to the JP forms. Examples (in yellow): 原, 愿 (CN only), 源, 縓, 塬, 螈 Screenshot 2023-01-11 at 00 04 26

I’m not sure about whether to have an open or a closed loop for 又, 叉, 支 and others. Currently as it stands, CN prefers having closed, JP prefers open. Please consider unifying them wherever appropriate. I will not list examples as there are too many of them to think about. Screenshot 2023-01-10 at 23 45 29

Components with 心 and 必 are themselves inconsistent across JP, CN, TW and HK. Maybe we should just use JP aesthetics for all regions (except for 心 itself)? The circled parts in the examples below show the discrepancies between the JP and the CN forms. Examples: 思, 悲, 恋, 志, 泌, 秘 Screenshot 2023-01-10 at 23 49 23

筵 (U+7B75, JP) has very slight glyph differences between the normal JIS2004 form (in red) and the JIS90 form (in blue). I suggest copying the normal glyph to the JIS90 glyph so they can be 100% identical. 筵 JIS90-JIS2004 comparison

Glyphs that need aesthetic improvement

To add to the ones reported in #359 and #321, 滲 (U+6EF2, CN) and 謝 (U+8B1D, TW) have some proportion problems. Please adjust them (below) to follow JP proportions (above). Screenshot 2023-01-10 at 19 00 15

詑 (U+8A51, CN) has a minor stroke issue at the top right (Heavy master only, circled), where it looks a bit too thick compared to the rest of the character. Please adjust accordingly. Screenshot 2023-01-10 at 22 49 42

Several 言 components in CN need to be improved, especially the horizontal bar above the 口, to follow the JP aesthetics. For that horizontal bar to stick out like a sore thumb, it looks a bit ugly. Examples: 記 (U+8A18 CN), 詣 (U+8A63 CN), 証 (U+8A3C CN; Heavy only, ExtraLight is fine), 註(U+8A3B CN), 詫 (U+8A6B CN) Screenshot 2023-01-11 at 00 31 00

For 這 (U+9019, CN), please adjust the top left stroke so that it matches JP aesthetics. Current above, mockup below: Screenshot 2023-01-10 at 22 46 57 Screenshot 2022-12-24 at 00 28 58

喚 (U+559A, JP) and 渙 (U+6E19, JP) should be adjusted so that the middle box component does not touch the bottom 大 component, as in the other JP glyphs (circled in the picture). The component needs some consistency here. On another note, 寏 (U+5BCF JP) is a bit different (八 instead of 大 at the bottom), so I'm not sure whether that glyph should also be fixed. Screenshot 2023-01-10 at 22 55 53

響 (U+97FF, CN) would need to be adjusted (Extralight master only) so that the last stroke of 良 does not touch the middle part (as circled). Screenshot 2023-01-10 at 22 27 26

墮 (U+58AE, JP) looks a bit weird on the Regular weight. Redesign the 月 so it doesn’t look like it's touching the bottom 土 component. Screenshot 2023-01-10 at 22 25 06

For 熱 (U+71B1, TW), the top right 丸 part should follow the CN form. Also attached is the Biaukai font for reference. Screenshot 2023-01-10 at 22 22 31

Missing glyphs

琶 (U+7436) is missing a CN glyph. The top left 王 part must be the same as the top right part according to 新字形, so the JP form cannot be used. Serif has the CN form, however. Addressed and expanded in #383

嚠 (U+56A0) and 赱 (U+8D71) are also missing CN glyphs. They existed in v1 and were removed in v2, so please consider restoring them if space is available. I can't list them all, as many more CN glyphs were removed in v2 to make way for HK glyphs, as explained later in the final section about glyph removal.

Wrong mapping

The characters 廨 (U+5EE8) and 榧 (U+69A7) should be mapped to HK for the TW locale, since now those HK glyphs exist. Screenshot 2023-01-10 at 19 40 47
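Most of the requests in this issue come down to the same two steps: point a locale at a different regional glyph, then see which glyphs are no longer referenced by any locale and could be dropped. A rough sketch of that bookkeeping is below. The table layout, the function names `remap` and `unreferenced`, and the glyph names are all hypothetical placeholders, not the font's actual cmap data or build tooling.

```python
def remap(table, char, locales, glyph):
    """Return a copy of `table` in which `char` uses `glyph` for the given locales."""
    new_table = {c: dict(mapping) for c, mapping in table.items()}
    for locale in locales:
        new_table[char][locale] = glyph
    return new_table


def unreferenced(before, after):
    """Glyphs used by some locale in `before` but by none in `after`."""
    used_before = {g for mapping in before.values() for g in mapping.values()}
    used_after = {g for mapping in after.values() for g in mapping.values()}
    return used_before - used_after


# Placeholder entry: the glyph TW currently maps to is not stated in this
# issue, so it is represented here by a dummy name.
table = {"廨": {"TW": "old-TW-glyph-placeholder", "HK": "uni5EE8-HK"}}

fixed = remap(table, "廨", ["TW"], "uni5EE8-HK")
print(fixed["廨"])
print(unreferenced(table, fixed))
```

In a real font a glyph that falls out of this table may still be needed elsewhere (for example, through a variation sequence, as with 帰 above), so the result is only a list of candidates for removal.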

Feet removal

Continuing issue #293, for 亜 (U+4E9C), I suggest to remove the JP glyph (left) and restore the v1 CN glyph (right). Screenshot 2023-01-10 at 20 15 04 Screenshot 2023-01-10 at 20 12 04

For 憇 (U+6187), the feet in 甜 must be removed because 心 is below it. In addition, as mentioned above, 心 must be made more consistent across all regions. I'm not sure, though, whether we can keep the leftmost foot circled in orange. Screenshot 2023-01-10 at 20 16 17

For 患, the JP glyph (left) has feet on 串. Please remove them and map that glyph to all regions, after which the CN glyph (right) can probably be removed. Alternatively, use the CN glyph for all regions and remove the JP glyph, whichever is easier or more aesthetically pleasing. Screenshot 2023-01-10 at 20 18 23

Additional feet removal

Edited, see here.

Potential CN/KR glyph removal in v3 in favour of MO support?

As noted in #292, I observed that moving from v1 to v2, not only most, if not all, of the non-Adobe-Japan1-6 JP glyphs were removed, but also the CN glyphs that are basically rarely used in China but were common in some Japanese shinjitai and JIS character sets, for example, 毎, 呉, 険, 銭. This was, as I said before, to make room for a lot of HK glyphs.

In v3 (presumably), Macao support (shorthand MO) is very likely to be added, so to accommodate such glyphs, some existing glyphs are going to be consolidated and sacrificed, as noted in #345. It just depends on how many MO-only glyphs are needed, and by rights it shouldn't be many. But if there are quite a lot, I suspect that the coverage for CN and/or KR might be reduced further unless more glyphs can be merged.

If that scenario were to happen, this is what I am speculating:

For CN, the remaining CN glyphs for Japanese shinjitai, kokuji and any other kanji in the Adobe-Japan1 lists that are not commonly used in China would probably be removed in v3. That leaves the JP glyph as the sole glyph because I was thinking the Chinese would not bat an eye over this potential inconsistency as they would rarely come across them.

So basically, I'm thinking the future CN glyph coverage might possibly be reduced to only properly cover GB2312, the 通用规范汉字表 (2013) set and 现代汉语通用字表 (1988) set with guaranteed correct orthography for China, and any characters outside those lists might either be kept if there’s space, or get remapped to JP/KR/TW/HK forms wherever appropriate. However, if the CN glyph is the sole glyph for those characters outside the aforementioned lists, they will remain. In other words, the GB18030 coverage with correct CN orthography will no longer be guaranteed due to limited space and a potential further loss of CN orthography for a good amount of characters in the GB18030 set.

In addition to this, I wonder if the subset would drop from GB18030 to just GB2312 and 通用规范汉字表 (basically covering only about 95% of Simplified Chinese, and will not account for variant characters, traditional Chinese and Japanese as it does today), or GBK (which drops most of CJK Extension A basically).
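To get a feel for which tier a given character falls into, here is a small sketch that uses Python's built-in `gb2312`, `gbk` and `gb18030` codecs as a stand-in for the three character sets. This is an approximation under stated assumptions: the codecs only say whether a character is encodable, not whether the font has a CN glyph for it, and there is no codec for 通用规范汉字表, so that list is not covered. The function name `smallest_gb_charset` is made up for this example.

```python
def smallest_gb_charset(ch):
    """Name of the smallest of the three GB codecs that can encode `ch`, or None."""
    for codec in ("gb2312", "gbk", "gb18030"):
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue
        return codec
    return None


# 语 is simplified, 語 is its traditional form, U+20000 is in CJK Extension B,
# and the rest are the shinjitai examples mentioned earlier in this section.
for ch in ["语", "語", "\U00020000", "毎", "呉", "険", "銭"]:
    print(f"U+{ord(ch):04X}", ch, smallest_gb_charset(ch))
```

Characters that only resolve at the `gbk` or `gb18030` tier are the ones whose CN orthography would be at risk under the scenarios described above.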

Alternatively, the second worst-case scenario is that the Korean glyph coverage would be reduced to cover only Level 0 Hanja (as in all the macOS Korean fonts that support Hanja), partially removing support for Level 1 Hanja that are normally in Windows Korean fonts (like Malgun Gothic), the extended Hanja that do not appear in Mac fonts or in any fonts that comply with the KS X 1001 standard. Any JP glyph that supports Level 1 Hanja and is part of the Adobe-Japan1-6 character set would remain; otherwise, they would get remapped to JP/CN/TW/HK forms.

But I hope these two scenarios would not happen because it would mean an even more compromised font than it already is now, and nobody would like to see reduced Chinese character glyph coverage for their respective languages. I want the opposite to happen with restored v1 CN glyphs to support GBK and GB18030 more properly, even with all the merged components, and we can still have room for MO glyphs.

Conclusion

Finally, I think it’s best that the unreleased JP sources (Dr. Lunde said there were about 3000 additional JP glyphs outside the Adobe-Japan1-6 character set that never made it to v1) and the v1 and v2 overlapping sources be released in UFO format as soon as possible, as mentioned in #292 (at that time I didn’t know which universal font editing format was used). I still hope that they will not go to waste, even if it takes a long time.

Marcus98T, Jan 10 '23 16:01

I have one more thing to add regarding the feet removal: The 呉 and 吳 components should have the feet removed as circled in both JP and KR. Characters affected (highlighted in yellow): 呉, 吳, 俁, 娛, 娯, 誤 (JP and KR), 悞, 鋘, 蜈 Screenshot 2023-01-11 at 01 36 22

~Probably 吳, 俁 and 悞 would remap to CN for JP and KR, otherwise, just manually remove the feet from the JP glyphs and then remove these CN glyphs.~ EDIT: I would very much prefer that the feet for 吳, 俁 and 悞 be removed so the JP glyphs can be kept and the CN glyphs be removed. Screenshot 2023-01-11 at 01 42 21

Also, the 大 in 吳, 娛, 鋘 and 蜈 should be closed and connected with the middle component to be more consistent with the other 吳 glyphs.

Marcus98T, Jan 10 '23 17:01

The glyphs for 弱 (U+5F31, CN and HK only) look unnecessarily similar. I suggest using the HK form for the CN/TW locales and removing the CN glyph.

Referring to 羽, the better way is to change TW from CN to HK.

寏 (U+5BCF JP) is a bit different (八 instead of 大 at the bottom)

Refer to Adobe Japan 1-6 CID+21445.

This issue only pertains to Source Han Sans and does not apply to Serif due to different design principles. While this may be a debatable issue, I think maybe the feet in JP and KR should be removed in radicals and components such as 女, 弓, 廴 and adopt the CN form. This way, all the other hyogaiji kanji outside of Adobe-Japan1-6 would look more consistent in comparison if this suggestion is implemented.

I strongly advocate against such design-breaking decisions. If that is the case, it is CN that should be changing, as there are no strict rules on these parts.

For CN, TW and HK, maybe I suggest to once again redesign the 辶 component (TW and HK only) and the 廴 component so the latter can be shared across all regions, as seen in the picture for the "new" 廴 radical above.

Refer to the CN glyph in the font. No redesign required. image

NightFurySL2001, Jan 11 '23 00:01

Referring to 羽, the better way is to change TW from CN to HK.

Maybe we should look into redesigning 羽 as a whole to follow TW/HK forms for JP (except for Hyogaiji), CN, TW and HK regions. Even some Japanese (in blue) and Chinese (in yellow) fonts have the 提 stroke not touching the outer component. Kozuka Gothic also has the components not touching, and Hiragino Sans GB and CNS's 羽 glyph is identical to the original JP version. Screenshot 2023-01-11 at 16 58 57

Refer to Adobe Japan 1-6 CID+21445.

Which is why I am asking if the adjustment to the middle component (such that it does not touch the bottom horizontal stroke) should be made for this character or not.

I strongly advocate against such design-breaking decisions. If that is the case, it is CN that should be changing, as there are no strict rules on these parts.

Refer to the CN glyph in the font. No redesign required. image

I do not believe that the JP and KR designs would accept this CN form in its current state. ~I was thinking, alternatively, to redesign it more like Lantinghei and Pingfang where there is a 90 degree angled corner in the middle part.~ Screenshot 2023-01-11 at 16 32 18

However, if the 廴 component cannot be merged because it will compromise the region-differentiating designs, then the 女 and 弓 components would still have to be merged. Some Japanese UD fonts (like UD Shin Go) and Hiragino Sans (with the exception of the characters on their own) also have these feet removed from the latter components as well. Screenshot 2023-01-11 at 16 37 45 Screenshot 2023-01-11 at 16 38 59

But it would be more likely that the 𠆢 roof radical and the ⺮ radical be merged first.

Marcus98T, Jan 11 '23 09:01

I was thinking, alternatively, to redesign it more like Lantinghei and Pingfang where there is a 90 degree angled corner in the middle part.

This is still a design-breaking change. Note that the top right corner has the same angled corner, and the current design matches that corner. Changing both corners requires changing all 折 designs (including but not limited to 又夂夕女[on left]也矛甬糹东车经).

BTW: you had "reported" this issue before. https://github.com/adobe-fonts/source-han-sans/issues/205#issuecomment-490052476

⺮ radical be merged first.

Unless they want to stop supporting 國字標準字體, that is not feasible.

Most issues here are superfluous and shouldn't be attended to before Source Han fixes most of the other issues. These are bad decisions, yes, but they are minor compared to other breaking issues in the repository. Please refrain from reposting issues you have already raised elsewhere.

NightFurySL2001, Jan 11 '23 10:01

Then forget about 廴, I just put a strike-through on that in my posts.

BTW: you had "reported" this issue before. https://github.com/adobe-fonts/source-han-sans/issues/205#issuecomment-490052476

Thanks for the reminder. I'll end this senseless talk here unless there is a serious bug with a character, and I will no longer open any new merging component issues on the Source Han Sans/Serif GitHub pages. I'll remind myself that there are already more than enough open issues for them to review, and that this one was already indirectly acknowledged. Hopefully they are in the midst of reviewing thousands of characters based on the existing issues we have. In the meantime, I will have to remind myself to let them decide and to be more patient until results show when v3 releases.

EDIT: What I was doing is a valid point, but I made the mistake of lumping merging design choices with stroke issues, wrong mappings, missing glyphs, etc. To avoid further annoyance, the issues stay where they are, but I will no longer post any new merging component issues.

Marcus98T, Jan 11 '23 10:01