
Missing 깄 glyph (U+AE44 HANGUL SYLLABLE GISS)

Open fdelapena opened this issue 9 years ago • 8 comments

According to the logs, this glyph seems to be missing from the bitmap font we currently use, which is based on Baekmuk Gulim.

fdelapena avatar Mar 25 '16 04:03 fdelapena

For Chinese glyphs there is a specific issue (#260).

Because the Chinese standard size is ~16x16 and Player uses a 12x12 font, scaling the bitmaps would look ugly. It might be better to rasterize the glyphs from a free scalable font instead, as there are no 12x12 fonts around with a suitable license.

fdelapena avatar Jul 20 '16 17:07 fdelapena

https://sourceforge.net/projects/wqy/files/wqy-bitmapfont/1.0.0-RC1/ wqy-bitmapsong-bdf-1.0.0-RC1.tar.gz wenquanyi_9pt.bdf

Here is a 9pt, 12x12 font. License: GPLv2 with font embedding exception.

I am working on this with an ugly method (font.cpp). It worked fine with the SimSun font, but SimSun has a license problem. Now I have started writing a Python script to generate the C++ code.

Sorry for my poor English.

DBLobster avatar Oct 11 '16 03:10 DBLobster

I saw on a Korean wiki that somebody says Korean glyphs outside of "KS X 1001" are missing, which is a really useful hint.

KS X 1001 only provides 2,350 precomposed glyphs. The Unicode block for Hangul Syllables contains 11,172 syllables (in a block of 11,184 code points). [Note: you can decompose a precomposed glyph and use the Hangul Jamo block [256 code points] [https://en.wikipedia.org/wiki/Hangul_Jamo_(Unicode_block)] to render it with HarfBuzz] :D

https://en.wikipedia.org/wiki/Hangul_Jamo_(Unicode_block)

https://en.wikipedia.org/wiki/Hangul_Syllables

Ghabry avatar Feb 10 '17 11:02 Ghabry

Though we haven't seen complaints from Korean users since #1058 was merged in 2016, using Jamo with HarfBuzz seems to be the best way to fix this, because it would let us save size by stripping the precomposed Hangul glyphs. I'm not sure whether it is OK to close this issue and just add a comment to the HarfBuzz issue. For completeness, here's a page with documents containing the mapping tables that show which characters are missing from KS X 1001 compared to UCS: http://asadal.pusan.ac.kr/~gimgs0/hangeul/code/hcode.html

fdelapena avatar May 06 '21 06:05 fdelapena

A note for later: the bigger fonts (shinonome must be split by region) could also be stored deflated. This increases memory usage at runtime, but a game usually doesn't mix C, J and K (+precomposed K), so this would still save memory.
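A rough sketch of that idea in Python, only to illustrate the trade-off: each region's glyph table is kept deflated and is inflated the first time it is needed, so regions a game never uses never take their full size in memory. The class name, the region names, and the 24 bytes per glyph (12 rows of 2 bytes) are assumptions for illustration, not Player's actual font layout.

```python
import zlib


class DeflatedFontRegion:
    """One region's glyph table, kept compressed until first use (hypothetical)."""

    def __init__(self, name, raw_bitmaps):
        self.name = name
        self.raw_size = len(raw_bitmaps)
        self._compressed = zlib.compress(raw_bitmaps, 9)
        self._inflated = None  # filled in on first access

    @property
    def compressed_size(self):
        return len(self._compressed)

    def bitmaps(self):
        # Inflate lazily: only regions that are actually rendered pay the
        # full uncompressed size at runtime.
        if self._inflated is None:
            self._inflated = zlib.decompress(self._compressed)
        return self._inflated


# Placeholder data standing in for real glyph tables (1000 blank glyphs each).
regions = {
    name: DeflatedFontRegion(name, bytes(24 * 1000))
    for name in ("chinese", "japanese", "korean")
}

# A Korean game only ever touches the Korean region.
korean = regions["korean"].bitmaps()
```

Only the region that is touched gets inflated here; the other two stay in their compressed form.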

Ghabry avatar Jul 14 '21 09:07 Ghabry

The algorithm for decomposing to the Jamo block is actually super simple:

codepoint must be in [0xAC00, 0xD7A3]

first = codepoint - 0xAC00
first_div = first / 28
first_mod = first % 28

if first_mod == 0 -> No "T" Jamo
else first_mod + 0x11A7 -> "T" Jamo

second_div = first_div / 21
second_mod = first_div % 21

second_div + 0x1100 -> "L" Jamo
second_mod + 0x1161 -> "V" Jamo

Edit: This won't work for rendering. The Jamo block must support shaping because the size of each Jamo differs depending on what is rendered next to it. So simply copying pixels is not an option ^^'
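The arithmetic itself is still useful for working out which Jamo a missing syllable consists of. A minimal Python sketch of the decomposition described above; the function and constant names are made up for illustration:

```python
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
S_LAST = 0xD7A3  # last precomposed syllable: 0xAC00 + 19 * 21 * 28 - 1


def decompose_hangul(codepoint):
    """Return the (L, V, T) Jamo code points; T is None if there is no final."""
    if not S_BASE <= codepoint <= S_LAST:
        raise ValueError(f"U+{codepoint:04X} is not a precomposed Hangul syllable")
    index = codepoint - S_BASE
    lv_index, t_index = divmod(index, T_COUNT)    # first_div, first_mod
    l_index, v_index = divmod(lv_index, V_COUNT)  # second_div, second_mod
    t = T_BASE + t_index if t_index else None
    return L_BASE + l_index, V_BASE + v_index, t


l, v, t = decompose_hangul(0xAE44)  # the glyph from the issue title
print(f"L=U+{l:04X} V=U+{v:04X} T=U+{t:04X}")  # L=U+1100 V=U+1175 T=U+11BB
```

For U+AE44 this gives ㄱ + ㅣ + ㅆ, which matches the "GISS" in the character name.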

Ghabry avatar Jul 14 '21 10:07 Ghabry

By now I believe that, at least for our 12x12 font, machine-generating most of the missing glyphs will work. For larger fonts this is trickier because the position of the T-Jamo often depends on what is above it. At 12x12 it is not possible to position it properly anyway, so there is not much to get wrong here 👍.

The main problem is that entire blocks are also missing (no L-V glyph and not a single L-V-T glyph). For those, at least 2 glyphs have to be drawn manually.

And after the machine generation, a manual review is also needed to see which glyphs need some tuning.

But in the end this looks feasible as a "longer weekend" task.
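To get a feeling for how much manual drawing is needed, one could first sort the 399 L-V blocks (28 syllables each) by how many of their glyphs the font already has. A small Python sketch, assuming the set of available code points can be extracted from the font; `classify_lv_blocks` and the toy input are hypothetical:

```python
S_BASE, V_COUNT, T_COUNT = 0xAC00, 21, 28
LV_COUNT = 19 * V_COUNT  # 399 blocks of 28 syllables each


def classify_lv_blocks(available):
    """Split the L-V blocks into complete, partial and entirely missing ones."""
    complete, partial, empty = [], [], []
    for lv in range(LV_COUNT):
        start = S_BASE + lv * T_COUNT
        present = sum(1 for cp in range(start, start + T_COUNT) if cp in available)
        if present == T_COUNT:
            complete.append(lv)
        elif present:
            partial.append(lv)  # candidates for machine generation
        else:
            empty.append(lv)    # nothing to derive from, needs manual drawing
    return complete, partial, empty


# Toy input: the first block is complete, the second has only its L-V glyph.
available = set(range(0xAC00, 0xAC1C)) | {0xAC1C}
complete, partial, empty = classify_lv_blocks(available)
print(len(complete), len(partial), len(empty))  # 1 1 397
```

The `empty` list corresponds to the blocks where nothing exists yet, so those are the ones that need hand-drawn glyphs before anything can be generated.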

Ghabry avatar Jul 14 '21 20:07 Ghabry

The character ㌧ (U+3327 SQUARE TON, decimal: 13095, HTML: &#13095;, UTF-8: 0xE3 0x8C 0xA7, script: Katakana, block: CJK Compatibility, decomposition: U+30C8 U+30F3) is also missing from the internal font. It is used, for example, in 子供たちの国-Magic Children-.
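Until such a glyph is drawn, one possible stopgap is to fall back to the compatibility decomposition listed above. This is only a sketch in Python and not how Player handles it: `fallback_text` and the `has_glyph` callback are hypothetical, and the result takes two character cells instead of one.

```python
import unicodedata


def fallback_text(char, has_glyph):
    """Return char if the font has it, else its compatibility decomposition."""
    if has_glyph(char):
        return char
    decomposed = unicodedata.normalize("NFKD", char)
    # Only substitute when every replacement character can be drawn.
    if all(has_glyph(c) for c in decomposed):
        return decomposed
    return char


# Toy font that has the two Katakana but not the square form.
font_chars = {"\u30c8", "\u30f3"}
print(fallback_text("\u3327", font_chars.__contains__))  # トン
```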

Mimigris avatar Sep 24 '23 09:09 Mimigris