glyphIgo Font subsetting excludes the substitution and ligature glyphs

The current font subsetting algorithm works by selecting the glyphs corresponding to given characters and removing everything else. This removes all the substitution and ligature glyphs as well. For Indic scripts (and potentially for other scripts), this leaves the font unusable.

The correct approach would be to pull in, for every character, all of its substitution and ligature glyphs. Of the many Python font subsetting scripts I found online, only the one included in fonttools works correctly: https://github.com/behdad/fonttools/blob/master/Lib/fontTools/subset.py

Mar 20 '15 09:03 abhaga

Thank you for pointing this out. It is a pretty serious problem indeed.

If I remember correctly, I gave this some thought a while ago and concluded that the fontforge library did not provide an easy way to extract substitutions and ligatures. This might have changed in a recent fontforge release, so I need to recheck.

Someone else pointed me to the TTX lib, and fonttools also seems pretty interesting. I will see whether switching to one of these does the job.

Mar 23 '15 21:03 pettarin

That is true. The fontforge library doesn't make it straightforward. It needs a two-pass process. In the first pass, go through all the glyphs and, using the getPosSub method, collect all the ligatures and substitutions recursively (some of the substitutions might themselves be composed of further glyphs). The union of all the glyphs encountered in this expansion is the set that you want to keep. In the second pass, you can remove the rest.
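A minimal sketch of that two-pass idea, in plain Python and independent of fontforge. The `rules` list is a hypothetical, simplified stand-in for whatever getPosSub reports (it is not the method's actual return format): each rule says "if all of these input glyphs are kept, these output glyphs must be kept too". The glyph names and function names are made up for illustration.

```python
def expand_glyph_set(seed_glyphs, rules):
    """First pass: grow the seed set until no rule adds a new glyph.

    rules is a list of (inputs, outputs) pairs of glyph-name tuples
    (an assumed representation, not fontforge's own).
    """
    keep = set(seed_glyphs)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in rules:
            # A rule fires only when every input glyph is already kept.
            if keep.issuperset(inputs) and not keep.issuperset(outputs):
                keep.update(outputs)
                changed = True  # new glyphs may enable further rules
    return keep


def glyphs_to_remove(all_glyphs, keep):
    """Second pass: everything not in the expanded set can be dropped."""
    return set(all_glyphs) - set(keep)


# Hypothetical glyph names and rules, for illustration only.
rules = [
    (("f", "i"), ("f_i",)),                   # ligature
    (("ka",), ("ka.alt",)),                   # single substitution
    (("ka.alt", "virama"), ("ka_virama",)),   # ligature built on a substitute
    (("x", "y"), ("x_y",)),                   # never fires: x and y are not kept
]
all_glyphs = ["f", "i", "f_i", "ka", "ka.alt", "virama", "ka_virama", "x", "y", "x_y"]

keep = expand_glyph_set({"f", "i", "ka", "virama"}, rules)
drop = glyphs_to_remove(all_glyphs, keep)
```

The loop repeats until a full sweep adds nothing, which is what handles substitutions that are themselves inputs to further ligatures.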

I have some code that does this, but I wrote it on top of another subsetting script I found. If I find time, I will port it to glyphIgo and send a pull request. In any case, thanks for the useful library. I'm also dealing with fonts in the context of e-books.

Mar 24 '15 01:03 abhaga

@abhaga --- apologies, I confused you with behdad, please ignore the previous comment.

Jun 07 '15 09:06 pettarin

:)

Jun 08 '15 04:06 abhaga

I asked behdad to update the fonttools package in PyPI, so that it can be used instead of fontforge here. See https://github.com/behdad/fonttools/issues/140#issuecomment-109742193

Jun 09 '15 11:06 pettarin

While waiting for a reply there ( https://github.com/behdad/fonttools/issues/140#issuecomment-109742193 ), I made a test by "adapting" glyphIgo to use subset.py from fonttools, and it seems to work well. As soon as it goes on PyPI, I will switch to it in glyphIgo.
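For reference, a rough sketch of what driving subset.py from Python can look like. This is not the actual glyphIgo adaptation, only an illustration under assumptions: the function names and file paths are hypothetical, and keeping every layout feature with `"*"` is one possible choice rather than a recommendation from this thread.

```python
def unique_characters(text):
    """Return the distinct non-whitespace characters of text, in first-seen order."""
    seen = []
    for ch in text:
        if not ch.isspace() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def subset_font(src_path, dst_path, text):
    """Write a subset of src_path covering text to dst_path."""
    # Imported here so the helper above is usable without fonttools installed.
    from fontTools import subset

    options = subset.Options()
    options.layout_features = ["*"]  # assumption: keep all layout features

    font = subset.load_font(src_path, options)
    subsetter = subset.Subsetter(options=options)
    # The subsetter expands the requested characters through the font's
    # substitution tables, so ligature and substitute glyphs are retained.
    subsetter.populate(text=unique_characters(text))
    subsetter.subset(font)
    subset.save_font(font, dst_path, options)
```

A call might look like `subset_font("input.ttf", "output.ttf", open("list.txt", encoding="utf-8").read())`, where both font paths are placeholders.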

Jun 12 '15 19:06 pettarin

Hi folks!

Thanks for making this, it's an incredibly useful tool!

Would this issue have any effect on being able to extract small capitals from a font file?

If I add the following to list.txt and run the subset command, the capitals and lowercase letters come out, but the small caps do not:

Aᴀa
Bʙb
Cᴄc

Jan 21 '16 04:01 Rusty-UX

I am planning to update glyphIgo to use the shiny new fonttools next week or the following one. While doing that, I will think about this issue. Thank you for reporting it.

Feb 06 '16 20:02 pettarin

Just for reference, in case someone needs this feature, https://github.com/filamentgroup/glyphhanger seems to keep ligatures (I did not verify whether it keeps other advanced font features too).

Jun 26 '18 17:06 elmimmo
