opentype.js icon indicating copy to clipboard operation
opentype.js copied to clipboard

Ligatures performance

Open axkibe opened this issue 7 years ago • 4 comments

As I've been working through the ligatures code, I noticed it recreates the ligatures collection on each and every call to stringToGlyphs()

Font.prototype.stringToGlyphs = function(s, options) {

....

    if (options.features) {
        var script = options.script || this.substitution.getDefaultScriptName();
        var manyToOne = [];
        if (options.features.liga) manyToOne = manyToOne.concat(this.substitution.getFeature('liga', script, options.language));
        if (options.features.rlig) manyToOne = manyToOne.concat(this.substitution.getFeature('rlig', script, options.language));
        for (i = 0; i < length; i += 1) {
            for (var j = 0; j < manyToOne.length; j++) {

as for a given language/script these created objects doesn't change, this ought to be created only once. It should not only safe the time to create these object but also remove the strain from the garbage collector.

PS: I also noted it makes a linear search over all enabled ligatures which also IMO should either be lookup tables or a binary search.

PPS: Also options.language and options.script are undocumented features used to enable certain ligatures, for example 'ffi' in DejaVuSans-Regular is only active when options.script is set to 'latn' as it isn't enable in 'DFLT' as this is a multi-language font.

I'm currently a little confused what the difference between options.language and options.script is, so I don't see myself fit to be the one to expand the README.

axkibe avatar Apr 27 '17 07:04 axkibe

Hi @axkibe,

unfortunately the list of ligatures can change quite a lot, depending on the script, language or features selected. Maybe this list could be memoized, but checking the options to retrieve the right memoized set would not necessarily be cheaper. Memoization consumes memory, too.

The search is not optimized at all. The manyToOne list could be sorted to allow binary-search on the 1st glyph. At the time I wrote this, the list contained at most 10 ligatures on my test fonts, so there was no big win here. But I agree that sooner or later optimizing this will become necessary.

options.script is the font-defined script name, e.g. "DFLT" or "latn" options.language is the font-defined language name, e.g. "ENG " or "DEU ". (see Language system tags) The font.tables.gsub.scripts object gives you an overview of the available tags for a specific font.

fpirsch avatar Apr 27 '17 14:04 fpirsch

As I assume normal use* of stringToGlyphs() refers to a lot of calls with the same script/language/features after each other, I suppose caching the last result would be reasonable an recalculate it, whenever something new arrives, no need to cache various combos tough.

'* At least somebody that sets text. I suppose for font editors its also more or less the same to have more calls to stringToGlyphs with same features set. Would be interested in other use scenarios of opentype.js.

axkibe avatar Apr 27 '17 14:04 axkibe

Suggestion: How about we split the function into two?

one calculateOptionsSet() -- looking for a better name. and stringToGlyphs() as it is.

The API user is free to use either stringToGlyphs as it is, with calculating stuff on the fly, or provide instead of the plain 'options' object a precalculated one... and it's up to the API user to cache it.

axkibe avatar Apr 27 '17 14:04 axkibe

@axkibe can this issue be closed ?

yne avatar Jun 05 '22 14:06 yne