opentype.js icon indicating copy to clipboard operation
opentype.js copied to clipboard

feat: Support for Mark to Base Attachment Positioning (incl. GPOS 4, 9)

Open rafallyczkowskiadylic opened this issue 2 years ago • 29 comments

Description

Added support for Mark to Base Attachment Positioning (for the DFLT script). Feature is enabled by default because it is a part of a proper glyphs positioning. Includes capabilities implementation for the corresponding parsers (GPOS lookup 4 & 9)

Followed the current solution's convention to introduce new features within the font & layout implementation.

  • added GPOS LookupType 4 parser for MarkBasePosFormat1 support
  • added GPOS LookupType 9 parser for Extension Positioning SubtableFormat1 support
  • added GSUB support for multiSubstitutionFormat1
  • introduced the union lookup feature (getFeaturesLookups) for the layout to meet the OT specs
  • fixed GSUB substitutions algorithm
  • tests coverage for kerning and positioning features
  • tests coverage for all supported and not supported lookups of GPOS

TODO: Add support for non DFLT scripts for MARK (src/font.js:344) TODO: Add a support device offsets for anchor table format 3 (src/parse.js:515) TODO: Add MarkToMark Attachment positioning (lookup type 6) functionality.

Motivation and Context

Positioning changes

Fonts like Mandarin or Thai to successfully render text with all glyphs (sometimes 3+) representing characters like ปั or ริ่ requiring the LookupType 4 for mark to base attachment positioning to do it properly.

Extension subtables support is required to support the subtables that exceeds the 16-bit limits of the various other offsets in the GPOS table - that genuinely are being used eg. for the Apples SF Pro fonts and many others.

Substitution changes

There was an issue with the current algorithm where it was collecting all substitutions from all lookup tables and after that each substitution was conducted. This is invalid. Because it should apply substitution for each lookup table before moving to the next lookup table. In other words: substitution should be applied before moving to the next lookup table

Specification:

During text processing, a client applies a lookup to each glyph in the string before moving to the next lookup. A lookup is finished for a glyph after the client locates the target glyph or glyph context and performs a substitution, if specified. To move to the “next” glyph, the client will typically skip all the glyphs that participated in the lookup operation: glyphs that were substituted as well as any other glyphs that formed a context for the operation.

Ref: https://learn.microsoft.com/en-us/typography/opentype/spec/gsub#table-organization

The union lookup feature

The union lookup feature (getFeaturesLookups) was required actually to meet the OpenType specification which guarantees that all enabled features will be processed in a valid order both for positioning and substitutions.

the list of lookups is the union of lookups referenced by all of those features, and these are all processed in their lookup list order.

How Has This Been Tested?

Added tests coverage to the test/parser.js and test/tables/gpos.js including new binary data and testing:

These lookups uses existing structures like coverage tables, and were tested agains different formats. Extension positioning lookup was tested agains all currently supported and not supported subtable parsers.

Added tests coverage to the test/layout.js to test the method for union lookups (getFeaturesLookups). Added tests coverage for glyph positioning tests in the test/font.js including:

  • Kerning feature support with kerning table and lookup table support
  • options.kerning and options.features.kern testing
  • Mark To Base Attachment feature support to work with kern features
  • options.features.mark testing Added tests coverage to the test/bidi.js to test multiSubstitutionFormat1 glyphs decomposition

Moreover there were many test conducted for the rendered content using latn and arab fonts.

Screenshots (if appropriate):

Positioning issues

BEFORE (INVALID glyphs pos): image

AFTER (FIXED): image

Substitution issues

BEFORE (INVALID) - very end glyph is a single glyph that should be decomposed into two. image

AFTER (FIXED) - decomposed into two glyph: image

image

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [x] I did npm run test and all tests passed green (including code styling checks).
  • [x] I have added tests to cover my changes.
  • [x] My change requires a change to the documentation.
  • [x] I have updated the README accordingly.
  • [x] I have read the CONTRIBUTING document.

rafallyczkowskiadylic avatar Feb 06 '23 13:02 rafallyczkowskiadylic

Does this make #491 redundant, or do we have to consolidate the two? Could you please have a look at each other's code @rafallyczkowskiadylic & @AlexandreRivet?

Connum avatar Feb 06 '23 18:02 Connum

@rafallyczkowskiadylic I noticed that rlig is missing an implementation for latin scripts and liga is missing an implementation for arabic scripts, is that also fixed in this PR?

ILOVEPIE avatar Feb 06 '23 20:02 ILOVEPIE

@Connum https://github.com/opentypejs/opentype.js/pull/491 is related and probably redundant. But initially I was reviewing it and it felt like not fully covered and bounding with kern feature only. This MR covers positioning feature in first place, and GPOS 9 was required to be supported along with the GPOS 4.

This MR fully covers what was proposed in https://github.com/opentypejs/opentype.js/pull/491/files

My proposition for modus operandi: Let's focus on https://github.com/opentypejs/opentype.js/pull/535 in first place. I will commit and cover the missing rlig over there. By the time I will be working on that please review this MR and drop questions or any concerns so we can discuss.

@ILOVEPIE These two MRs are considered as separate chunks of features. I will fix whatever will be needed starting with the smaller one https://github.com/opentypejs/opentype.js/pull/535

rafallyczkowskiadylic avatar Feb 07 '23 11:02 rafallyczkowskiadylic

Also, @rafallyczkowskiadylic does this PR fix #501, because if you're redoing this code anyways, it might be worth fixing it.

ILOVEPIE avatar Feb 07 '23 20:02 ILOVEPIE

@rafallyczkowskiadylic the feature order is quite a critical thing to add before we start working on implementing more features like calt etc. It would be great if we could merge this soon!

Connum avatar Mar 02 '23 22:03 Connum

@Connum once we have the other MR merged (that covers rlig that is missing here as well) I will resolve and address all concerns around this mr shortly.

rafallyczkowskiadylic avatar Mar 03 '23 17:03 rafallyczkowskiadylic

Sure, that's fine! We don't have to support everything at once when implementing a new feature or format. We should just make sure that users are notified about any missing/unsupported functionality.

I just wanted to avoid that anything was misusing that was supposed to be in this PR already.

Connum avatar Mar 03 '23 18:03 Connum

@Connum Perfect! Great you are on top. So, let's address what is pending now. We stay in touch

rafallyczkowskiadylic avatar Mar 03 '23 18:03 rafallyczkowskiadylic

I will check https://github.com/opentypejs/opentype.js/issues/501

rafallyczkowskiadylic avatar Mar 07 '23 08:03 rafallyczkowskiadylic

#501 is fixed within this MR

rafallyczkowskiadylic avatar May 11 '23 10:05 rafallyczkowskiadylic

@Connum @ILOVEPIE added few more commits with the substitution algorithm fixes and a new substitution format support. I haven't refactored the other working code (just the required bits). We can find few spots where we see "similar" functionalities. For example:

  1. src/features/featureQuery.js FeatureQuery.prototype.lookupFeature collects substitutions - too much responsibility here. I marked this as deprecated and pointed the fixed algorithm
  2. src/substitution.js Substitution.prototype.getFeature - substitution processing
  3. src/bidi.js - substitution applications

I believe we could move towards:

  • Layout - to keep what it has, consume GPOS/GSUB definitions, lookups. Keep important getFeaturesLookups method that build a union lookups list for ordered processing.
  • Position and Substitution - shares base functionality to provide features lookups through the Layout
  • FeatureQuery - should be promoted to use this to find and process features (not only in the bidi)
  • substitution features per script i.e src/features/latn/latinLigatures.js src/features/arab/arabicRequiredLigatures.js should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor of applySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);

Not sure if bidi is really required, maybe for the directional substitutions but I think bidi could be deprecated and it's responsibility could be dropped into the Font where bidi is being called and where positioning logic is happening anyways. But basically I think we could use a single processor both for substitutions and positioning.

Well this can open a discussion, but wanted to share some thoughts.

rafallyczkowskiadylic avatar May 12 '23 10:05 rafallyczkowskiadylic

I'm totally for unifying duplicate functionality. The way that feature processing is currently structured is something that has deterred me from working on new font features because it frankly seems a bit messy and not very intuitive to me. However, I would see that as a separate step from this PR. Let's get this support in first and consolidate feature processing in the future. Otherwise there might be too much code to review at once, and we have enough open and partly almost-finished PRs lying around as it is.

That's my opinion and not the final say, of course.

edit: after a quick glance at the code changes, it looks to me like you have already done most of the work described in your previous comment? Of course let's keep the work you already put in there and just move any further consolidation to somewhere down the line.

Connum avatar Oct 19 '23 10:10 Connum

This is blocking quite a few PRs, we should probably get this merged sometime soon.

ILOVEPIE avatar Oct 29 '23 20:10 ILOVEPIE

Thank you guys for the comments. Yes, most of suggestions/work has been done. I have consolidated functionality as described introducing more generic functions to handle/fix specifications. I will resolve conflicts soon.

rafallyczkowskiadylic avatar Nov 01 '23 14:11 rafallyczkowskiadylic

Great, looking forward to it! Thanks for your work!

Connum avatar Nov 01 '23 15:11 Connum

Fixes #501 according to https://github.com/opentypejs/opentype.js/issues/501#issuecomment-1543727128

@rafallyczkowskiadylic it's been a couple of weeks - just a friendly nudge. :)

Connum avatar Nov 25 '23 18:11 Connum

I was finally able to test this more thoroughly... I merged it into my experimental fork after resolving some conflicts.

The chars in the GPOS–3 test of the unicode test suite still rendered wrong, and they also didn't work in our demo page. image image

So I had to make a few changes to make it work, see https://github.com/Connum/opentype.js-next/commit/95688c6f54ec55274cfe2eee86eb8493788ca632

image image Hey presto!

However, these calculations are quite expensive, even before my additions, because they are done for each single glyph. I don't think that's necessary and we should implement ways to at least cache this, or best only call the feature update when something has changed.

Connum avatar Nov 29 '23 01:11 Connum

Ok, now the tests for disabling the mark feature are failing of course, because I implemented it so that default features cannot be disabled. We'll have to talk about how we want to handle this.

Connum avatar Nov 29 '23 01:11 Connum

The more I look at this, the more confused I get... 😅

I think we still need some cleaning up before we can finally merge this, but I'm lost here without you @rafallyczkowskiadylic. I tried to fix https://rawgit.com/unicode-org/text-rendering-tests/master/reports/OpenType.js.html#GSUB-1, but I got caught up in spaghetti code for several hours trying to follow the path of the substitution mechanism and got confused between old and new functionality. e.g. we should really get rid of this:

  • substitution features per script i.e src/features/latn/latinLigatures.js src/features/arab/arabicRequiredLigatures.js should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor of applySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);

Regarding this substitution case, I started wondering why we split into words in the first place? The space glyph being part of a substitution is perfectly fine, there's no rule against it to my knowledge.

I know that's the crux of open source, but I wouldn't be such a pest if it wasn't blocking the progress so much. A clear structure for substitution and positioning as well as supporting all the different formats is really crucial for implementing further features.

Connum avatar Nov 30 '23 19:11 Connum

By the way, trying to get rid of the latn-specific applyLatinLigatures, I replaced it with

function applyLatinFeatures() {
    checkGlyphIndexStatus.call(this);
    applySubstitutions.call(this, 'latn', ['liga', 'rlig', 'ccmp', 'calt']);
}

in analogy to applyThaiFeatures(), and called it in Bidi.prototype.applyFeaturesToContexts:

    if (this.checkContextReady('latinWord')) {
        applyLatinFeatures.call(this);
    }

But then all ligatures stopped working...

Connum avatar Nov 30 '23 19:11 Connum

Hey @Connum I am on it. Let me resolve conflicts and see what is on the code. I will revert back by the end of the weekend

rafallyczkowskiadylic avatar Dec 02 '23 08:12 rafallyczkowskiadylic

@Connum Conflicts resolved.

Pending items:

  1. Could you share this TestShapeEthi.ttf file with me so I can verify the outcomes?
  2. I understand. I had similar issue, with some redundancy to get into the logic, that is why I didn't wanted to deprecate old code but rather leverage a new core lookup features. The most important functions is: Layout.getFeaturesLookups that implements common logic for fetching features, for substitutions and for positioning.

I started wondering why we split into words in the first place?

It was implemented like that, and the reason for this is that within single sentence there may be characters from a different language context. Example: 我的iPhone是最好的 where "iPhone" is for the latin context word.

  1. For the applyLatinLigatures issue, could you send me specific case you are testing? Including exact string, expected result, and source font file. I will check what may be happening because I see no issues with the latin fonts I am testing currently.

Moreover, please keep in mind supported substitution formats.

/**
 * Supported substitutions
 */
const SUBSTITUTIONS = {
    11: singleSubstitutionFormat1,
    12: singleSubstitutionFormat2,
    63: chainingSubstitutionFormat3,
    41: ligatureSubstitutionFormat1, 
    21: multiSubstitutionFormat1
};

Different fonts' source files may have these definitions in a different formats that eventually may not be supported yet.


@Connum if you could share all your use cases, with results you expect to see and source font files - I can review all of them and eventually address.


For the clean up I think we could just mark functions as deprecated for now. I'd refactor only bits covered with reliable tests so we can actually rely on outcomes before/after we change the code (this was also the reason I'd rather applied the code and "promoted" than removing or refactoring old functioning pieces).

rafallyczkowskiadylic avatar Dec 02 '23 14:12 rafallyczkowskiadylic

Is there easy way to tests svg using demo page? I wrapped up simple compiler so I am using this a

npm run font:preview -- --font="Roboto-Black.ttf" --content='Lorem ipsum dolor sit amet. ąłćżółźżćśćłóęą '

that generates preview.html that compares HTML (browser) rendered content and SVG path.

image

rafallyczkowskiadylic avatar Dec 02 '23 14:12 rafallyczkowskiadylic

Thanks for your work!

You can check out the unicode test suite: https://github.com/unicode-org/text-rendering-tests

It does exactly what you proposed, compare SVG output with expected results. You can replace the opentype.js files in the node_modules folder with the files compiled from your dev branch (also the files in /bin/) to see if everything related to GSUB passes.

The repo also provides the test files. If you're able to fix this with this PR, you should copy the file to our tests folder (it's licensed under OFL) and add our own test. But it's OK if we just do the refactoring and mark to base first if it holds us up too much. But I'll definitely need help with that space thing as well. :)

Connum avatar Dec 02 '23 15:12 Connum

Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR

rafallyczkowskiadylic avatar Dec 03 '23 09:12 rafallyczkowskiadylic

Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR

See my reply! 😉 But it's OK if we stick to GPOS mark-to-base with this PR and tackle the rest later. I'll review the code tomorrow.

Connum avatar Dec 03 '23 09:12 Connum

I don't really like the idea of having both approaches in the code base in parallel. The arab and latn features should work with the new approach as well, so we should move everything to work with it. I understand if this is more than you're willing to do. Could you mark the PR as allowing maintainers to make changes? Then me or someone else could look into fixing this for all the scripts.

Connum avatar Dec 05 '23 15:12 Connum

Shouldn't I be able to change applyLatinLigatures() to

function applyLatinLigatures() {
    applySubstitutions.call(this, 'latn', ['liga']);
}

and it should just work? But the ligature tests then fail.

Connum avatar Dec 05 '23 16:12 Connum

Shouldn't I be able to change applyLatinLigatures() to

function applyLatinLigatures() {
    applySubstitutions.call(this, 'latn', ['liga']);
}

and it should just work? But the ligature tests then fail.

Actually I checked and found few tests failing for your cases, for latn they are related to manipulating font's definitions (Substitution.prototype.addLigature). These implementations are still using old lookupTables search, and manipulating local references (

In other word, addLigature is using Layout.getLookupTables, manipulates this method's output references (expecting the test will also use .getLookupTables method). Thing is my solution does not use .getLookupTables at all because of getFeaturesLookups

Substitution.prototype.addLigature = function(feature, ligature, script, language) {
    const lookupTable = this.getLookupTables(script, language, feature, 4, true)[0];
    let subtable = lookupTable.subtables[0];
    ...
    subtable.ligatureSets = .... // <- manipulation
});

yet Layout.getLookupTables creates own copy of subtables refs (that is why we can't see them within new feature)

image

If .addLigrature would instead manipulate definition on the this.font.tables[this.tableName].lookups level (not local references) everything will work as expected. This is how I would approach this, to deprecate any use of getLookupTables and if there are definitions updated I would try to make them as "high" as possible (ideally around specific table def source, this.font.tables[this.tableName]....)

rafallyczkowskiadylic avatar Dec 10 '23 14:12 rafallyczkowskiadylic