
[tx] subsetting CIDFont with `-t1 -decid -g '/0'` option fails

Open takaakifuji opened this issue 2 years ago • 3 comments

A colleague of mine found that recent versions of tx fail in the following cases:

# download a CID-Keyed OpenType/CFF font and see what's included in each FD
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ fdarray-check.pl SourceHanSans-Regular.otf | grep Generic
SourceHanSans-Regular-Generic (5): /0,/1078-/1237,/1286,/1293,/62067,/65531-/65534 (168)
# a. fail: explicitly pass CID+0 to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -dump SourceHanSans-Regular.cid | grep sup.srcFontType
sup.srcFontType     Type 1 (cid-keyed)
$ tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0

# b. fail: pass CID+0 as last argument to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -dump SourceHanSans-Regular.cid | grep sup.srcFontType
sup.srcFontType     Type 1 (cid-keyed)
$ tx -t1 -decid -g '/1286,/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0

When the source font is a CIDFont, making a Type 1 subset fails with the unterminated charstring error if .notdef (CID+0) is the only glyph requested (case a.) or the last one listed (case b.). However, the following tests succeed:

# c. pass: directly converting from OTF to T1
$ tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.otf
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}

# d. pass: give empty string to -g option (tx inserts '.notdef' as part of postprocessing)
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}

# e. pass: omit CID+0 from -g option (this time tx appends '.notdef' at the end)
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '/1286' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {cid1286,-,1}
glyph[1] {.notdef,-,1}

# f. pass: give both CID+0 and CID+1286 to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '/0,/1286' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}
glyph[1] {cid1286,-,1}
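
The pass/fail pattern in cases a. through f. can be re-checked in one go. The following is a minimal sketch, not part of the original report: try_subset and check_cases are hypothetical helper names, and treating "a non-empty output file exists" as a pass is an assumption. It uses only the tx options already shown above. Note that, as reported further down in this thread, tx in AFDKO 2.7.2 and earlier prints the error but still writes a font, so this check would count those runs as passing.

```shell
#!/bin/sh
# try_subset TX FONT GLYPHS OUT   (hypothetical helper)
#   TX     : tx executable to test (e.g. tx or bin/3.9.2/tx)
#   FONT   : CID-keyed Type 1 source font
#   GLYPHS : value for the -g option (e.g. '/0')
#   OUT    : output path for the name-keyed Type 1 font
# Prints "pass" or "fail" followed by the glyph list.
try_subset() {
  rm -f "$4"
  # Errors such as "unterminated charstring" go to stderr; silence them here.
  "$1" -t1 -decid -g "$3" -o "$4" "$2" >/dev/null 2>&1 || true
  # Assumption: a non-empty output file means the subset succeeded.
  if [ -s "$4" ]; then
    echo "pass: -g '$3'"
  else
    echo "fail: -g '$3'"
  fi
}

# check_cases TX FONT: run the glyph lists from cases a., b., d., e. and f.
check_cases() {
  for glyphs in '/0' '/1286,/0' '' '/1286' '/0,/1286'; do
    try_subset "$1" "$2" "$glyphs" notdef.pfa
  done
}
```

Usage would be along the lines of check_cases tx SourceHanSans-Regular.cid, run in a scratch directory because notdef.pfa is overwritten on each iteration.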

The failure appears to occur when saveCstr() in t1read.c accidentally deciphers the charstring for .notdef twice. Given that case b. fails while case f. succeeds, the bug is most likely in how the already-deciphered portion of the charstring buffer is handled.
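
To illustrate why a second deciphering pass would plausibly lead to this error: Type 1 charstrings are encrypted with a running-key cipher (initial key 4330, constants 52845 and 22719, as given in the Adobe Type 1 Font Format specification). Running the decryption over bytes that are already plaintext scrambles them, so a parser would most likely no longer find endchar where it expects it. The sketch below is an illustration in plain shell arithmetic, not the code in t1read.c; the function name t1_crypt and the sample charstring (0 500 hsbw endchar) are made up for the example.

```shell
#!/bin/sh
# t1_crypt MODE BYTE...   (hypothetical helper, illustration only)
#   MODE : enc (plaintext -> ciphertext) or dec (ciphertext -> plaintext)
#   BYTE : byte values in decimal (0-255)
# Prints the transformed bytes, space-separated.
t1_crypt() {
  mode=$1; shift
  r=4330            # initial key for charstring encryption
  out=
  for b in "$@"; do
    x=$(( b ^ (r >> 8) ))
    # The key is always advanced with the ciphertext byte.
    if [ "$mode" = enc ]; then c=$x; else c=$b; fi
    r=$(( ((c + r) * 52845 + 22719) & 65535 ))
    out="$out${out:+ }$x"
  done
  echo "$out"
}

# Four lenIV padding bytes, then: 0 500 hsbw endchar
# (139 encodes 0, "248 136" encodes 500, 13 is hsbw, 14 is endchar)
plain='0 0 0 0 139 248 136 13 14'
cipher=$(t1_crypt enc $plain)
once=$(t1_crypt dec $cipher)    # correct: equals $plain
twice=$(t1_crypt dec $once)     # deciphered a second time: scrambled

echo "plain : $plain"
echo "once  : $once"
echo "twice : $twice"
```

The first decryption restores the original bytes, while the second one differs from the plaintext from the very first byte onward.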

Instead of $ tx -t1 -decid -g '/0' -o notdef.pfa font.cid, using case d. ($ tx -t1 -decid -g '' -o notdef.pfa font.cid) as a workaround successfully generates a Type 1 font that contains only the .notdef glyph from the source CIDFont, but it looks like an accidental solution.
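
Since the workaround relies on behavior that may be accidental, it seems prudent to verify the result each time. The following is a minimal sketch under that assumption: notdef_only is a hypothetical wrapper name, and the check relies on the glyph[0] {.notdef,-,1} line format that tx -dump prints in the cases above.

```shell
#!/bin/sh
# notdef_only TX SRC OUT   (hypothetical wrapper)
#   TX  : tx executable
#   SRC : CID-keyed Type 1 source font
#   OUT : output path for the name-keyed Type 1 font
# Applies the -g '' workaround (case d.), then succeeds only if the
# result holds exactly one glyph and that glyph is .notdef.
notdef_only() {
  "$1" -t1 -decid -g '' -o "$3" "$2" || return 1
  count=$("$1" -dump "$3" | grep -c '^glyph')
  [ "$count" -eq 1 ] || return 1
  "$1" -dump "$3" | grep -q '^glyph\[0\] {\.notdef,'
}
```

For example: notdef_only tx SourceHanSans-Regular.cid notdef.pfa && echo ok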

I'm not sure how to tackle this issue, but I hope it gets addressed, as this CIDFont -> T1 conversion path is valuable when patching an existing font.

takaakifuji, Jan 18 '23 10:01

Thank you for reporting this! To clarify, is this occurring with AFDKO version 3.9.2? If so, are you able to check if this issue is also happening in the previous AFDKO version 3.9.1?

kaydeearts, Jan 18 '23 18:01

> is this occurring with AFDKO version 3.9.2?

Yes.

$ mkdir -p bin/3.9.2 bin/3.9.1 bin/2.9.1 bin/2.7.0 bin/2.5.65781 bin/2.5.65322
$ curl -s  'https://files.pythonhosted.org/packages/fb/72/656175cf07ffa084135d3b49b275d53a008a834e207eff5fac2308196012/afdko-3.9.2-py3-none-macosx_10_9_x86_64.whl'    | tar -xO afdko-3.9.2.data/scripts/tx    > bin/3.9.2/tx
$ curl -s  'https://files.pythonhosted.org/packages/4e/ba/4f75ac83817cef360cd70265cb02bd9523cc17951946ecc26574976c9b7b/afdko-3.9.1-py3-none-macosx_10_9_x86_64.whl'    | tar -xO afdko-3.9.1.data/scripts/tx    > bin/3.9.1/tx
$ curl -s  'https://files.pythonhosted.org/packages/32/33/c769ff75e47ff1e09b7a5a0fe561cb401a30d7a7dacd0e4f507c700a7aa0/afdko-2.9.1-py2.py3-none-macosx_10_6_intel.whl' | tar -xO afdko-2.9.1.data/scripts/tx    > bin/2.9.1/tx
$ curl -s  'https://files.pythonhosted.org/packages/3d/8c/be1892b9210d885a844b9fabf6d4f0575d8bf79e122608f1486e10c6bccb/afdko-2.7.0-cp27-cp27m-macosx_10_10_intel.whl'  | tar -xO afdko-2.7.0.data/scripts/tx    > bin/2.7.0/tx
$ curl -sL 'https://web.archive.org/web/20180708155416/http://download.macromedia.com/pub/developer/opentype/FDK.2.5.65781/FDK.2.5.65781-MAC.zip'                      | tar -xO FDK/Tools/osx/tx               > bin/2.5.65781/tx
$ curl -sL 'https://github.com/adobe-type-tools/afdko/releases/download/2.5.65322/FDK-25-MAC.b65322.zip'                                                               | tar -xO FDK-25-MAC.b65322/Tools/osx/tx > bin/2.5.65322/tx
$ find bin -type f -name 'tx' -exec chmod +x '{}' \;
$ bin/3.9.2/tx -v | grep tx
    tx        1.3.0
$ bin/3.9.1/tx -v | grep tx
    tx        1.3.0
$ bin/2.9.1/tx -v | grep tx
    tx        1.2.1
$ bin/2.7.0/tx -v | grep tx
    tx        1.0.72
$ bin/2.5.65781/tx -v | grep tx
    tx        1.0.70
$ bin/2.5.65322/tx -v | grep tx
    tx        1.0.66
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ bin/3.9.2/tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ bin/3.9.2/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/3.9.1/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.9.1/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.5.65781/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.5.65322/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0

Contrary to my initial expectation, none of the tx versions tested above (3.9.2, 3.9.1, 2.9.1, and even 2.5.65781 and 2.5.65322) worked. Tested on macOS 10.15.7 (x86_64). This seems to be a long-standing issue, but thank you for the response anyway!

takaakifuji, Jan 18 '23 23:01

Sorry for missing the point in the previous post. I finally found a difference between AFDKO 2.7.2 and AFDKO 2.8.0.

Both versions emit the same unterminated charstring CID-0 error to the terminal, but the former still generates a Type 1 font while the latter produces nothing. Again, sorry for cluttering the thread; here is the test result:

$ mkdir -p bin/3.9.2 bin/3.9.1 bin/2.9.1 bin/2.8.0 bin/2.7.2 bin/2.7.0 bin/2.5.65781 bin/2.5.65322
$ curl -s  'https://files.pythonhosted.org/packages/fb/72/656175cf07ffa084135d3b49b275d53a008a834e207eff5fac2308196012/afdko-3.9.2-py3-none-macosx_10_9_x86_64.whl'    | tar -xO afdko-3.9.2.data/scripts/tx    > bin/3.9.2/tx
$ curl -s  'https://files.pythonhosted.org/packages/4e/ba/4f75ac83817cef360cd70265cb02bd9523cc17951946ecc26574976c9b7b/afdko-3.9.1-py3-none-macosx_10_9_x86_64.whl'    | tar -xO afdko-3.9.1.data/scripts/tx    > bin/3.9.1/tx
$ curl -s  'https://files.pythonhosted.org/packages/32/33/c769ff75e47ff1e09b7a5a0fe561cb401a30d7a7dacd0e4f507c700a7aa0/afdko-2.9.1-py2.py3-none-macosx_10_6_intel.whl' | tar -xO afdko-2.9.1.data/scripts/tx    > bin/2.9.1/tx
$ curl -s  'https://files.pythonhosted.org/packages/f2/6a/0a70124ab1848c2dfbb7a19263bd9146317c8a31cea03570f22f907a779e/afdko-2.8.0-cp27-cp27m-macosx_10_6_intel.whl'   | tar -xO afdko-2.8.0.data/scripts/tx    > bin/2.8.0/tx
$ curl -s  'https://files.pythonhosted.org/packages/23/a7/034dea97909ba741e842730de3fedf79cd06495766f1e1944fdbbe33cca5/afdko-2.7.2-cp27-cp27m-macosx_10_10_intel.whl'  | tar -xO afdko-2.7.2.data/scripts/tx    > bin/2.7.2/tx
$ curl -s  'https://files.pythonhosted.org/packages/3d/8c/be1892b9210d885a844b9fabf6d4f0575d8bf79e122608f1486e10c6bccb/afdko-2.7.0-cp27-cp27m-macosx_10_10_intel.whl'  | tar -xO afdko-2.7.0.data/scripts/tx    > bin/2.7.0/tx
$ curl -sL 'https://web.archive.org/web/20180708155416/http://download.macromedia.com/pub/developer/opentype/FDK.2.5.65781/FDK.2.5.65781-MAC.zip'                      | tar -xO FDK/Tools/osx/tx               > bin/2.5.65781/tx
$ curl -sL 'https://github.com/adobe-type-tools/afdko/releases/download/2.5.65322/FDK-25-MAC.b65322.zip'                                                               | tar -xO FDK-25-MAC.b65322/Tools/osx/tx > bin/2.5.65322/tx
$ find bin -type f -name 'tx' -exec chmod +x '{}' \;
$ bin/3.9.2/tx -v | grep tx
    tx        1.3.0
$ bin/3.9.1/tx -v | grep tx
    tx        1.3.0
$ bin/2.9.1/tx -v | grep tx
    tx        1.2.1
$ bin/2.8.0/tx -v | grep tx
    tx        1.2.0
$ bin/2.7.2/tx -v | grep tx
    tx        1.1.0
$ bin/2.7.0/tx -v | grep tx
    tx        1.0.72
$ bin/2.5.65781/tx -v | grep tx
    tx        1.0.70
$ bin/2.5.65322/tx -v | grep tx
    tx        1.0.66
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ bin/3.9.2/tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ bin/3.9.2/tx     -t1 -decid -g ''   -o 01_notdef-3.9.2.pfa     SourceHanSans-Regular.cid
$ bin/3.9.1/tx     -t1 -decid -g ''   -o 01_notdef-3.9.1.pfa     SourceHanSans-Regular.cid
$ bin/2.9.1/tx     -t1 -decid -g ''   -o 01_notdef-2.9.1.pfa     SourceHanSans-Regular.cid
$ bin/2.8.0/tx     -t1 -decid -g ''   -o 01_notdef-2.8.0.pfa     SourceHanSans-Regular.cid
$ bin/2.7.2/tx     -t1 -decid -g ''   -o 01_notdef-2.7.2.pfa     SourceHanSans-Regular.cid
$ bin/2.7.0/tx     -t1 -decid -g ''   -o 01_notdef-2.7.0.pfa     SourceHanSans-Regular.cid
$ bin/2.5.65781/tx -t1 -decid -g ''   -o 01_notdef-2.5.65781.pfa SourceHanSans-Regular.cid
$ bin/2.5.65322/tx -t1 -decid -g ''   -o 01_notdef-2.5.65322.pfa SourceHanSans-Regular.cid
$ bin/3.9.2/tx     -t1 -decid -g '/0' -o 02_notdef-3.9.2.pfa     SourceHanSans-Regular.cid
$ bin/3.9.1/tx     -t1 -decid -g '/0' -o 02_notdef-3.9.1.pfa     SourceHanSans-Regular.cid
$ bin/2.9.1/tx     -t1 -decid -g '/0' -o 02_notdef-2.9.1.pfa     SourceHanSans-Regular.cid
$ bin/2.8.0/tx     -t1 -decid -g '/0' -o 02_notdef-2.8.0.pfa     SourceHanSans-Regular.cid
$ bin/2.7.2/tx     -t1 -decid -g '/0' -o 02_notdef-2.7.2.pfa     SourceHanSans-Regular.cid
$ bin/2.7.0/tx     -t1 -decid -g '/0' -o 02_notdef-2.7.0.pfa     SourceHanSans-Regular.cid
$ bin/2.5.65781/tx -t1 -decid -g '/0' -o 02_notdef-2.5.65781.pfa SourceHanSans-Regular.cid
$ bin/2.5.65322/tx -t1 -decid -g '/0' -o 02_notdef-2.5.65322.pfa SourceHanSans-Regular.cid
$ ls -l *_notdef-*.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 01_notdef-2.5.65322.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 01_notdef-2.5.65781.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 01_notdef-2.7.0.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 01_notdef-2.7.2.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 01_notdef-2.8.0.pfa
-rw-r--r--  1 tfuji  staff  3133 Jan 19 11:11 01_notdef-2.9.1.pfa
-rw-r--r--  1 tfuji  staff  3133 Jan 19 11:11 01_notdef-3.9.1.pfa
-rw-r--r--  1 tfuji  staff  3133 Jan 19 11:11 01_notdef-3.9.2.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 02_notdef-2.5.65322.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 02_notdef-2.5.65781.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 02_notdef-2.7.0.pfa
-rw-r--r--  1 tfuji  staff  3138 Jan 19 11:11 02_notdef-2.7.2.pfa

The point is that none of 02_notdef-3.9.2.pfa, 02_notdef-3.9.1.pfa, 02_notdef-2.9.1.pfa, or 02_notdef-2.8.0.pfa was produced, which is what led us to devise the -g '' workaround for this particular use case with later versions of AFDKO.
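
Rather than reading the ls listing by eye, the missing outputs can be listed directly. This is a small sketch; missing_outputs is a hypothetical helper name, and it assumes the PREFIX + version + .pfa naming used in the commands above.

```shell
#!/bin/sh
# missing_outputs PREFIX VERSION...   (hypothetical helper)
# Prints each VERSION for which the file PREFIX + VERSION + ".pfa"
# does not exist in the current directory.
missing_outputs() {
  prefix=$1; shift
  for v in "$@"; do
    [ -e "${prefix}${v}.pfa" ] || echo "$v"
  done
}
```

With the files listed above, missing_outputs 02_notdef- 3.9.2 3.9.1 2.9.1 2.8.0 2.7.2 2.7.0 2.5.65781 2.5.65322 would print 3.9.2, 3.9.1, 2.9.1 and 2.8.0, one per line.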

$ md5sum *_notdef-*.pfa
54f2b084e980942ecb364fc889f478fe  01_notdef-2.5.65322.pfa
54f2b084e980942ecb364fc889f478fe  01_notdef-2.5.65781.pfa
54f2b084e980942ecb364fc889f478fe  01_notdef-2.7.0.pfa
54f2b084e980942ecb364fc889f478fe  01_notdef-2.7.2.pfa
54f2b084e980942ecb364fc889f478fe  01_notdef-2.8.0.pfa
208ba70568e2664309656019c79c2fa2  01_notdef-2.9.1.pfa
9fa80762686074b912c83ec652d439aa  01_notdef-3.9.1.pfa
9fa80762686074b912c83ec652d439aa  01_notdef-3.9.2.pfa
54f2b084e980942ecb364fc889f478fe  02_notdef-2.5.65322.pfa
54f2b084e980942ecb364fc889f478fe  02_notdef-2.5.65781.pfa
54f2b084e980942ecb364fc889f478fe  02_notdef-2.7.0.pfa
54f2b084e980942ecb364fc889f478fe  02_notdef-2.7.2.pfa

$ tx -dump -3 01_notdef-2.8.0.pfa | grep -v '## Filename' > 01_notdef-2.8.0.txt
$ tx -dump -3 01_notdef-2.9.1.pfa | grep -v '## Filename' > 01_notdef-2.9.1.txt
$ tx -dump -3 01_notdef-3.9.1.pfa | grep -v '## Filename' > 01_notdef-3.9.1.txt

$ md5sum 01_notdef-*.txt
751b980976095b29b1bf84a0da353503  01_notdef-2.8.0.txt
d7eaba1e4452c0a6aa3e2ccca39f7811  01_notdef-2.9.1.txt
d7eaba1e4452c0a6aa3e2ccca39f7811  01_notdef-3.9.1.txt

$ diff -u 01_notdef-2.8.0.txt 01_notdef-2.9.1.txt
--- 01_notdef-2.8.0.txt	2023-01-19 11:19:39.000000000 +0900
+++ 01_notdef-2.9.1.txt	2023-01-19 11:19:39.000000000 +0900
@@ -4,7 +4,7 @@
 FamilyName          "Source Han Sans"
 Weight              "Regular"
 UnderlinePosition   -150
-FontBBox            {-1002,-1048,2928,1808}
+FontBBox            {100,-120,900,880}
 FSType              0
 sup.srcFontType     Type 1 (name-keyed)
 sup.nGlyphs         1

The differences between the PFA files across versions come mainly from the FontBBox recalculation, and in older tx versions -g '/0' and -g '' give the same output. Hope it helps!
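
The comparison used above (dump both fonts, drop the filename line, compare) can be wrapped into a single check. This is a sketch only; same_dump is a hypothetical helper name, and it reuses the tx -dump -3 and grep -v '## Filename' steps exactly as they appear in the commands above.

```shell
#!/bin/sh
# same_dump TX A B   (hypothetical helper)
#   TX   : tx executable used for dumping
#   A, B : the two fonts to compare
# Succeeds if the "tx -dump -3" output of A and B is identical once the
# "## Filename" line, which differs per file, is dropped.
same_dump() {
  a=$("$1" -dump -3 "$2" | grep -v '## Filename')
  b=$("$1" -dump -3 "$3" | grep -v '## Filename')
  [ "$a" = "$b" ]
}
```

For example: same_dump tx 01_notdef-2.9.1.pfa 01_notdef-3.9.1.pfa && echo identical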

takaakifuji, Jan 19 '23 02:01