[tx] subsetting CIDFont with `-t1 -decid -g '/0'` option fails
My colleague found that recent versions of tx cannot handle the following cases:
# download a CID-Keyed OpenType/CFF font and see what's included in each FD
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ fdarray-check.pl SourceHanSans-Regular.otf | grep Generic
SourceHanSans-Regular-Generic (5): /0,/1078-/1237,/1286,/1293,/62067,/65531-/65534 (168)
# a. fail: explicitly pass CID+0 to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -dump SourceHanSans-Regular.cid | grep sup.srcFontType
sup.srcFontType Type 1 (cid-keyed)
$ tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
# b. fail: pass CID+0 as last argument to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -dump SourceHanSans-Regular.cid | grep sup.srcFontType
sup.srcFontType Type 1 (cid-keyed)
$ tx -t1 -decid -g '/1286,/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
When a CIDFont is given as the source font, creating a Type 1 subset fails with the unterminated charstring error if CID+0 (.notdef) is the only glyph, or the last glyph, passed to the -g option. However, the following tests succeed:
# c. pass: directly converting from OTF to T1
$ tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.otf
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}
# d. pass: give empty string to -g option (tx inserts '.notdef' as part of postprocessing)
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}
# e. pass: omit CID+0 from -g option (this time tx appends '.notdef' at the end)
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '/1286' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {cid1286,-,1}
glyph[1] {.notdef,-,1}
# f. pass: give both CID+0 and CID+1286 to -g option
$ tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ tx -t1 -decid -g '/0,/1286' -o notdef.pfa SourceHanSans-Regular.cid
$ tx -dump notdef.pfa | grep '^glyph'
glyph[0] {.notdef,-,1}
glyph[1] {cid1286,-,1}
It looks like the failure occurs when saveCstr() in t1read.c accidentally deciphers the charstring for .notdef twice. Given that case b. fails while case f. succeeds, the bug is most likely in how the already-deciphered portion of the charstring buffer is handled.
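To illustrate why a second deciphering pass could end up as an "unterminated charstring", here is a minimal sketch of the Type 1 charstring cipher, using the key 4330 and the constants 52845 and 22719 from the Adobe Type 1 Font Format specification. This is not code from t1read.c: the function names and the sample bytes are made up for illustration, and the lenIV prefix bytes are ignored.

```shell
#!/bin/sh
# Type 1 charstring cipher constants (Adobe Type 1 Font Format specification).
C1=52845
C2=22719
KEY=4330

# t1_encrypt BYTE...: print the enciphered bytes as decimal numbers.
# (Hypothetical helper, only needed to build test input.)
t1_encrypt() {
    r=$KEY
    out=""
    for plain in "$@"; do
        cipher=$(( (plain ^ (r >> 8)) & 255 ))
        r=$(( ((cipher + r) * C1 + C2) & 65535 ))
        out="$out $cipher"
    done
    echo "${out# }"
}

# t1_decrypt BYTE...: print the deciphered bytes as decimal numbers.
t1_decrypt() {
    r=$KEY
    out=""
    for cipher in "$@"; do
        plain=$(( (cipher ^ (r >> 8)) & 255 ))
        r=$(( ((cipher + r) * C1 + C2) & 65535 ))
        out="$out $plain"
    done
    echo "${out# }"
}

# Illustrative charstring bytes for "0 1000 hsbw endchar".
sample="139 250 124 13 14"
enc=$(t1_encrypt $sample)
once=$(t1_decrypt $enc)     # deciphered once
twice=$(t1_decrypt $once)   # deciphered again by mistake
echo "once:  $once"
echo "twice: $twice"
```

One pass gives back the sample bytes, while the second pass scrambles them, so a reader that scans the result for a terminating operator may no longer find one. Whether this is what actually happens inside saveCstr() is still only my guess.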
Using case d. ($ tx -t1 -decid -g '' -o notdef.pfa font.cid) instead of $ tx -t1 -decid -g '/0' -o notdef.pfa font.cid works as a workaround: it successfully generates a Type 1 font which contains only the .notdef glyph from the source CIDFont. It looks like an accidental solution, though.
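Based on cases d. and e. above, where tx adds .notdef on its own, the workaround could be generalized by dropping the bare '/0' selector from the list before it reaches -g. A minimal sketch follows; drop_notdef is a hypothetical helper of mine, not part of AFDKO, and it only handles plain comma-separated selectors (a range that starts at /0 is left untouched).

```shell
#!/bin/sh
# drop_notdef LIST: print LIST without the bare '/0' selector.
# LIST is a comma-separated glyph list as passed to tx -g.
drop_notdef() {
    result=""
    old_ifs=$IFS
    IFS=','
    for sel in $1; do
        if [ "$sel" != "/0" ]; then
            # Join the kept selectors with commas.
            result="${result:+$result,}$sel"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Intended use (requires tx and a CID-keyed Type 1 source font):
#   tx -t1 -decid -g "$(drop_notdef '/1286,/0')" -o subset.pfa font.cid
```

Note that, as in case e., tx then appends .notdef at the end instead of placing it at glyph[0], so the glyph order may differ from what was requested.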
I'm not sure how to tackle this issue, but I hope it gets addressed, as this CIDFont -> T1 conversion path is valuable when patching an existing font.
Thank you for reporting this! To clarify, is this occurring with AFDKO version 3.9.2? If so, are you able to check if this issue is also happening in the previous AFDKO version 3.9.1?
> is this occurring with AFDKO version 3.9.2?
Yes.
$ mkdir -p bin/3.9.2 bin/3.9.1 bin/2.9.1 bin/2.7.0 bin/2.5.65781 bin/2.5.65322
$ curl -s 'https://files.pythonhosted.org/packages/fb/72/656175cf07ffa084135d3b49b275d53a008a834e207eff5fac2308196012/afdko-3.9.2-py3-none-macosx_10_9_x86_64.whl' | tar -xO afdko-3.9.2.data/scripts/tx > bin/3.9.2/tx
$ curl -s 'https://files.pythonhosted.org/packages/4e/ba/4f75ac83817cef360cd70265cb02bd9523cc17951946ecc26574976c9b7b/afdko-3.9.1-py3-none-macosx_10_9_x86_64.whl' | tar -xO afdko-3.9.1.data/scripts/tx > bin/3.9.1/tx
$ curl -s 'https://files.pythonhosted.org/packages/32/33/c769ff75e47ff1e09b7a5a0fe561cb401a30d7a7dacd0e4f507c700a7aa0/afdko-2.9.1-py2.py3-none-macosx_10_6_intel.whl' | tar -xO afdko-2.9.1.data/scripts/tx > bin/2.9.1/tx
$ curl -s 'https://files.pythonhosted.org/packages/3d/8c/be1892b9210d885a844b9fabf6d4f0575d8bf79e122608f1486e10c6bccb/afdko-2.7.0-cp27-cp27m-macosx_10_10_intel.whl' | tar -xO afdko-2.7.0.data/scripts/tx > bin/2.7.0/tx
$ curl -sL 'https://web.archive.org/web/20180708155416/http://download.macromedia.com/pub/developer/opentype/FDK.2.5.65781/FDK.2.5.65781-MAC.zip' | tar -xO FDK/Tools/osx/tx > bin/2.5.65781/tx
$ curl -sL 'https://github.com/adobe-type-tools/afdko/releases/download/2.5.65322/FDK-25-MAC.b65322.zip' | tar -xO FDK-25-MAC.b65322/Tools/osx/tx > bin/2.5.65322/tx
$ find bin -type f -name 'tx' -exec chmod +x '{}' \;
$ bin/3.9.2/tx -v | grep tx
tx 1.3.0
$ bin/3.9.1/tx -v | grep tx
tx 1.3.0
$ bin/2.9.1/tx -v | grep tx
tx 1.2.1
$ bin/2.7.0/tx -v | grep tx
tx 1.0.72
$ bin/2.5.65781/tx -v | grep tx
tx 1.0.70
$ bin/2.5.65322/tx -v | grep tx
tx 1.0.66
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ bin/3.9.2/tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ bin/3.9.2/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/3.9.1/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.9.1/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.5.65781/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
$ bin/2.5.65322/tx -t1 -decid -g '/0' -o notdef.pfa SourceHanSans-Regular.cid
tx: --- SourceHanSans-Regular.cid
tx: (t1r) unterminated charstring CID-0
Contrary to my initial expectation, none of the tx versions above (3.9.2, 3.9.1, 2.9.1, or even 2.5.65781 and 2.5.65322) worked. Tested on macOS 10.15.7 (x86_64). It seems to be a long-standing issue, but thank you for the response anyway!
Sorry for missing the point in my previous post. I finally found a difference between AFDKO 2.7.2 and AFDKO 2.8.0.
Both versions emit the same unterminated charstring CID-0 error to the terminal, but the former still generates a Type 1 font while the latter produces nothing. Again, sorry for cluttering the thread, but see the following test results:
$ mkdir -p bin/3.9.2 bin/3.9.1 bin/2.9.1 bin/2.8.0 bin/2.7.2 bin/2.7.0 bin/2.5.65781 bin/2.5.65322
$ curl -s 'https://files.pythonhosted.org/packages/fb/72/656175cf07ffa084135d3b49b275d53a008a834e207eff5fac2308196012/afdko-3.9.2-py3-none-macosx_10_9_x86_64.whl' | tar -xO afdko-3.9.2.data/scripts/tx > bin/3.9.2/tx
$ curl -s 'https://files.pythonhosted.org/packages/4e/ba/4f75ac83817cef360cd70265cb02bd9523cc17951946ecc26574976c9b7b/afdko-3.9.1-py3-none-macosx_10_9_x86_64.whl' | tar -xO afdko-3.9.1.data/scripts/tx > bin/3.9.1/tx
$ curl -s 'https://files.pythonhosted.org/packages/32/33/c769ff75e47ff1e09b7a5a0fe561cb401a30d7a7dacd0e4f507c700a7aa0/afdko-2.9.1-py2.py3-none-macosx_10_6_intel.whl' | tar -xO afdko-2.9.1.data/scripts/tx > bin/2.9.1/tx
$ curl -s 'https://files.pythonhosted.org/packages/f2/6a/0a70124ab1848c2dfbb7a19263bd9146317c8a31cea03570f22f907a779e/afdko-2.8.0-cp27-cp27m-macosx_10_6_intel.whl' | tar -xO afdko-2.8.0.data/scripts/tx > bin/2.8.0/tx
$ curl -s 'https://files.pythonhosted.org/packages/23/a7/034dea97909ba741e842730de3fedf79cd06495766f1e1944fdbbe33cca5/afdko-2.7.2-cp27-cp27m-macosx_10_10_intel.whl' | tar -xO afdko-2.7.2.data/scripts/tx > bin/2.7.2/tx
$ curl -s 'https://files.pythonhosted.org/packages/3d/8c/be1892b9210d885a844b9fabf6d4f0575d8bf79e122608f1486e10c6bccb/afdko-2.7.0-cp27-cp27m-macosx_10_10_intel.whl' | tar -xO afdko-2.7.0.data/scripts/tx > bin/2.7.0/tx
$ curl -sL 'https://web.archive.org/web/20180708155416/http://download.macromedia.com/pub/developer/opentype/FDK.2.5.65781/FDK.2.5.65781-MAC.zip' | tar -xO FDK/Tools/osx/tx > bin/2.5.65781/tx
$ curl -sL 'https://github.com/adobe-type-tools/afdko/releases/download/2.5.65322/FDK-25-MAC.b65322.zip' | tar -xO FDK-25-MAC.b65322/Tools/osx/tx > bin/2.5.65322/tx
$ find bin -type f -name 'tx' -exec chmod +x '{}' \;
$ bin/3.9.2/tx -v | grep tx
tx 1.3.0
$ bin/3.9.1/tx -v | grep tx
tx 1.3.0
$ bin/2.9.1/tx -v | grep tx
tx 1.2.1
$ bin/2.8.0/tx -v | grep tx
tx 1.2.0
$ bin/2.7.2/tx -v | grep tx
tx 1.1.0
$ bin/2.7.0/tx -v | grep tx
tx 1.0.72
$ bin/2.5.65781/tx -v | grep tx
tx 1.0.70
$ bin/2.5.65322/tx -v | grep tx
tx 1.0.66
$ curl -LO 'https://github.com/adobe-fonts/source-han-sans/raw/2.004R/OTF/Japanese/SourceHanSans-Regular.otf'
$ bin/3.9.2/tx -t1 SourceHanSans-Regular.otf > SourceHanSans-Regular.cid
$ bin/3.9.2/tx -t1 -decid -g '' -o 01_notdef-3.9.2.pfa SourceHanSans-Regular.cid
$ bin/3.9.1/tx -t1 -decid -g '' -o 01_notdef-3.9.1.pfa SourceHanSans-Regular.cid
$ bin/2.9.1/tx -t1 -decid -g '' -o 01_notdef-2.9.1.pfa SourceHanSans-Regular.cid
$ bin/2.8.0/tx -t1 -decid -g '' -o 01_notdef-2.8.0.pfa SourceHanSans-Regular.cid
$ bin/2.7.2/tx -t1 -decid -g '' -o 01_notdef-2.7.2.pfa SourceHanSans-Regular.cid
$ bin/2.7.0/tx -t1 -decid -g '' -o 01_notdef-2.7.0.pfa SourceHanSans-Regular.cid
$ bin/2.5.65781/tx -t1 -decid -g '' -o 01_notdef-2.5.65781.pfa SourceHanSans-Regular.cid
$ bin/2.5.65322/tx -t1 -decid -g '' -o 01_notdef-2.5.65322.pfa SourceHanSans-Regular.cid
$ bin/3.9.2/tx -t1 -decid -g '/0' -o 02_notdef-3.9.2.pfa SourceHanSans-Regular.cid
$ bin/3.9.1/tx -t1 -decid -g '/0' -o 02_notdef-3.9.1.pfa SourceHanSans-Regular.cid
$ bin/2.9.1/tx -t1 -decid -g '/0' -o 02_notdef-2.9.1.pfa SourceHanSans-Regular.cid
$ bin/2.8.0/tx -t1 -decid -g '/0' -o 02_notdef-2.8.0.pfa SourceHanSans-Regular.cid
$ bin/2.7.2/tx -t1 -decid -g '/0' -o 02_notdef-2.7.2.pfa SourceHanSans-Regular.cid
$ bin/2.7.0/tx -t1 -decid -g '/0' -o 02_notdef-2.7.0.pfa SourceHanSans-Regular.cid
$ bin/2.5.65781/tx -t1 -decid -g '/0' -o 02_notdef-2.5.65781.pfa SourceHanSans-Regular.cid
$ bin/2.5.65322/tx -t1 -decid -g '/0' -o 02_notdef-2.5.65322.pfa SourceHanSans-Regular.cid
$ ls -l *_notdef-*.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 01_notdef-2.5.65322.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 01_notdef-2.5.65781.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 01_notdef-2.7.0.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 01_notdef-2.7.2.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 01_notdef-2.8.0.pfa
-rw-r--r-- 1 tfuji staff 3133 Jan 19 11:11 01_notdef-2.9.1.pfa
-rw-r--r-- 1 tfuji staff 3133 Jan 19 11:11 01_notdef-3.9.1.pfa
-rw-r--r-- 1 tfuji staff 3133 Jan 19 11:11 01_notdef-3.9.2.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 02_notdef-2.5.65322.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 02_notdef-2.5.65781.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 02_notdef-2.7.0.pfa
-rw-r--r-- 1 tfuji staff 3138 Jan 19 11:11 02_notdef-2.7.2.pfa
The point is that 02_notdef-3.9.2.pfa, 02_notdef-3.9.1.pfa, 02_notdef-2.9.1.pfa and 02_notdef-2.8.0.pfa are all missing here, which is what led us to come up with the -g '' workaround for this particular use case in later versions of AFDKO.
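Instead of reading the ls output by eye, the missing files can be listed with a small helper. This is just a sketch: report_missing is a hypothetical function name, and it assumes the PREFIX-VERSION.pfa naming used in the commands above.

```shell
#!/bin/sh
# report_missing PREFIX VERSION...: print each VERSION whose
# PREFIX-VERSION.pfa is absent or empty in the current directory.
report_missing() {
    prefix=$1
    shift
    for ver in "$@"; do
        if [ ! -s "${prefix}-${ver}.pfa" ]; then
            printf '%s\n' "$ver"
        fi
    done
}

# Intended use, after running the tx commands above:
#   report_missing 02_notdef 3.9.2 3.9.1 2.9.1 2.8.0 2.7.2 2.7.0 2.5.65781 2.5.65322
```

The -s test treats a zero-length file the same as a missing one, so it does not matter whether a failing tx leaves an empty file behind or nothing at all.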
$ md5sum *_notdef-*.pfa
54f2b084e980942ecb364fc889f478fe 01_notdef-2.5.65322.pfa
54f2b084e980942ecb364fc889f478fe 01_notdef-2.5.65781.pfa
54f2b084e980942ecb364fc889f478fe 01_notdef-2.7.0.pfa
54f2b084e980942ecb364fc889f478fe 01_notdef-2.7.2.pfa
54f2b084e980942ecb364fc889f478fe 01_notdef-2.8.0.pfa
208ba70568e2664309656019c79c2fa2 01_notdef-2.9.1.pfa
9fa80762686074b912c83ec652d439aa 01_notdef-3.9.1.pfa
9fa80762686074b912c83ec652d439aa 01_notdef-3.9.2.pfa
54f2b084e980942ecb364fc889f478fe 02_notdef-2.5.65322.pfa
54f2b084e980942ecb364fc889f478fe 02_notdef-2.5.65781.pfa
54f2b084e980942ecb364fc889f478fe 02_notdef-2.7.0.pfa
54f2b084e980942ecb364fc889f478fe 02_notdef-2.7.2.pfa
$ tx -dump -3 01_notdef-2.8.0.pfa | grep -v '## Filename' > 01_notdef-2.8.0.txt
$ tx -dump -3 01_notdef-2.9.1.pfa | grep -v '## Filename' > 01_notdef-2.9.1.txt
$ tx -dump -3 01_notdef-3.9.1.pfa | grep -v '## Filename' > 01_notdef-3.9.1.txt
$ md5sum 01_notdef-*.txt
751b980976095b29b1bf84a0da353503 01_notdef-2.8.0.txt
d7eaba1e4452c0a6aa3e2ccca39f7811 01_notdef-2.9.1.txt
d7eaba1e4452c0a6aa3e2ccca39f7811 01_notdef-3.9.1.txt
$ diff -u 01_notdef-2.8.0.txt 01_notdef-2.9.1.txt
--- 01_notdef-2.8.0.txt 2023-01-19 11:19:39.000000000 +0900
+++ 01_notdef-2.9.1.txt 2023-01-19 11:19:39.000000000 +0900
@@ -4,7 +4,7 @@
FamilyName "Source Han Sans"
Weight "Regular"
UnderlinePosition -150
-FontBBox {-1002,-1048,2928,1808}
+FontBBox {100,-120,900,880}
FSType 0
sup.srcFontType Type 1 (name-keyed)
sup.nGlyphs 1
The PFA differences between versions come mostly from the bbox recalculation, and -g '/0' and -g '' give the same output in older tx versions. Hope it helps!
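To check that nothing else differs, the FontBBox line can be filtered out together with the filename comment before comparing the dumps. A minimal sketch, assuming the only version-dependent lines in the tx -dump -3 output are the two patterns below; strip_volatile is a hypothetical name.

```shell
#!/bin/sh
# strip_volatile: read a tx dump on stdin and drop the lines that are
# expected to differ between versions (filename comment and FontBBox).
strip_volatile() {
    sed -e '/## Filename/d' -e '/FontBBox/d'
}

# Intended use (requires tx):
#   tx -dump -3 01_notdef-2.8.0.pfa | strip_volatile > 01_notdef-2.8.0.txt
#   tx -dump -3 01_notdef-2.9.1.pfa | strip_volatile > 01_notdef-2.9.1.txt
#   diff -u 01_notdef-2.8.0.txt 01_notdef-2.9.1.txt
```

If the diff comes back empty after this filtering, the bbox recalculation is the only difference in the dumped data.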