glyphIgo
Font subsetting excludes the substitution and ligature glyphs
The current font subsetting algorithm works by selecting the glyphs corresponding to given characters and removing everything else. This removes all the substitution and ligature glyphs as well. For Indic scripts (and potentially for other scripts), this leaves the font unusable.
The correct way would be to pull in, for every selected character, all of its substitution glyphs and all of the ligature glyphs it takes part in. Of the many font subsetting Python scripts I found online, only the one included in fonttools works correctly: https://github.com/behdad/fonttools/blob/master/Lib/fontTools/subset.py
Thank you for pointing this out. It is a pretty serious problem indeed.
If I remember correctly, I gave this some thought a while ago, and the conclusion was that the fontforge lib did not provide an easy way to extract substitutions and ligatures. But this might have changed in a recent fontforge lib release; I need to recheck.
Someone else pointed me to the TTX lib, and fonttools also seems pretty interesting. I will see whether switching to one of these will do the job.
That is true. The fontforge library doesn't make it straightforward. It needs a two-pass process. In the first pass, go through all the glyphs and, using the getPosSub method, collect all the ligatures and substitutions recursively (some of the substitutions might themselves be composed of further glyphs). The union of all the glyphs encountered in this expansion is the set that you want to keep. In the second pass, you can remove the rest of them.
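A minimal sketch of that two-pass idea, in plain Python rather than against the fontforge API. The rule table, the function names, and the glyph names are all made up for illustration; in practice the rules would have to be built from whatever getPosSub reports for each glyph.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical rule format: (input glyph names, output glyph names).
# A single substitution has one input; a ligature has several.
Rule = Tuple[Tuple[str, ...], Tuple[str, ...]]


def glyph_closure(seed_glyphs: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Pass 1: expand the seed glyphs until no rule adds anything new."""
    keep = set(seed_glyphs)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in rules:
            # A rule can only fire if every input glyph is already kept.
            if all(g in keep for g in inputs) and not keep.issuperset(outputs):
                keep.update(outputs)
                changed = True
    return keep


def glyphs_to_remove(all_glyphs: Iterable[str], keep: Set[str]) -> Set[str]:
    """Pass 2: everything outside the closure can be dropped."""
    return set(all_glyphs) - keep


# Illustrative data only, not taken from a real font.
rules = [
    (("ka", "virama", "ssa"), ("k_ssa",)),  # ligature
    (("k_ssa",), ("k_ssa.half",)),          # substitution of the ligature
    (("f", "i"), ("f_i",)),                 # unrelated ligature
]
all_glyphs = ["ka", "virama", "ssa", "k_ssa", "k_ssa.half", "f", "i", "f_i"]

keep = glyph_closure(["ka", "virama", "ssa"], rules)
drop = glyphs_to_remove(all_glyphs, keep)
```

The loop repeats until a full pass adds nothing, which is what handles substitutions of substitutions.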
I have some code written that does this, but I did it on top of another subsetting script I found. If I find time, I will port it to glyphIgo and send a pull request. In any case, thanks for the useful library. I'm also dealing with fonts in the context of e-books.
@abhaga --- apologies, I confused you with behdad, please ignore the previous comment.
:)
I asked behdad to update the fonttools package on PyPI, so that it can be used here instead of fontforge. See https://github.com/behdad/fonttools/issues/140#issuecomment-109742193
While waiting for a reply there ( https://github.com/behdad/fonttools/issues/140#issuecomment-109742193 ), I made a test by "adapting" glyphIgo to use subset.py from fonttools, and it seems to work well. As soon as it goes on PyPI, I will switch to it in glyphIgo.
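For anyone curious what subsetting through fonttools can look like, here is a rough sketch. It is not the glyphIgo adaptation itself: the helper names and the file paths in the usage note are hypothetical, and it assumes a fonttools release that ships the fontTools.subset module.

```python
from typing import Iterable, Set


def chars_from_lines(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct non-whitespace characters found in the lines."""
    return {ch for line in lines for ch in line if not ch.isspace()}


def subset_font(src_path: str, dst_path: str, text: str) -> None:
    """Write a subset of src_path covering text, keeping layout features."""
    # Imported here so the helper above is usable without fontTools installed.
    from fontTools import subset

    options = subset.Options()
    options.layout_features = ["*"]  # keep every OpenType layout feature
    font = subset.load_font(src_path, options)
    subsetter = subset.Subsetter(options=options)
    subsetter.populate(text=text)    # seed the subset with these characters
    subsetter.subset(font)           # also pulls in glyphs reachable via GSUB
    subset.save_font(font, dst_path, options)


wanted = chars_from_lines(["Aᴀa\n", "Bʙb\n"])
```

Usage would be along the lines of `subset_font("in.ttf", "out.ttf", "".join(sorted(wanted)))`, with both paths being placeholders.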
Hi folks!
Thanks for making this, it's an incredibly useful tool!
Would this issue have any effect on being able to extract small capitals from a font file?
If I add the following to list.txt and run the subset command, the capitals and lowercase letters come out, but the small caps do not:
Aᴀa
Bʙb
Cᴄc
I am planning to update glyphIgo to use the shiny new fonttools next week or the following one. While doing that, I will think about this issue. Thank you for reporting it.
Just for reference, in case someone needs this feature: https://github.com/filamentgroup/glyphhanger seems to keep ligatures (I did not verify whether it keeps other advanced font features as well).