opentype.js
opentype.js copied to clipboard
feat: Support for Mark to Base Attachment Positioning (incl. GPOS 4, 9)
Description
Added support for Mark to Base Attachment Positioning (for the DFLT script). Feature is enabled by default because it is a part of a proper glyphs positioning. Includes capabilities implementation for the corresponding parsers (GPOS lookup 4 & 9)
Followed the current solution's convention to introduce new features within the font & layout implementation.
- added GPOS
LookupType 4
parser forMarkBasePosFormat1
support - added GPOS
LookupType 9
parser for Extension PositioningSubtableFormat1
support - added GSUB support for
multiSubstitutionFormat1
- introduced the union lookup feature (
getFeaturesLookups
) for the layout to meet the OT specs - fixed GSUB substitutions algorithm
- tests coverage for kerning and positioning features
- tests coverage for all supported and not supported lookups of GPOS
TODO: Add support for non DFLT scripts for MARK
(src/font.js:344
)
TODO: Add a support device offsets for anchor table format 3 (src/parse.js:515
)
TODO: Add MarkToMark Attachment positioning (lookup type 6) functionality.
Motivation and Context
Positioning changes
Fonts like Mandarin or Thai to successfully render text with all glyphs (sometimes 3+) representing characters like ปั
or ริ่
requiring the LookupType 4
for mark to base attachment positioning to do it properly.
Extension subtables support is required to support the subtables that exceeds the 16-bit limits of the various other offsets in the GPOS table - that genuinely are being used eg. for the Apples SF Pro fonts and many others.
Substitution changes
There was an issue with the current algorithm where it was collecting all substitutions from all lookup tables and after that each substitution was conducted. This is invalid. Because it should apply substitution for each lookup table before moving to the next lookup table. In other words: substitution should be applied before moving to the next lookup table
Specification:
During text processing, a client applies a lookup to each glyph in the string before moving to the next lookup. A lookup is finished for a glyph after the client locates the target glyph or glyph context and performs a substitution, if specified. To move to the “next” glyph, the client will typically skip all the glyphs that participated in the lookup operation: glyphs that were substituted as well as any other glyphs that formed a context for the operation.
Ref: https://learn.microsoft.com/en-us/typography/opentype/spec/gsub#table-organization
The union lookup feature
The union lookup feature (getFeaturesLookups
) was required actually to meet the OpenType specification which guarantees that all enabled features will be processed in a valid order both for positioning and substitutions.
the list of lookups is the union of lookups referenced by all of those features, and these are all processed in their lookup list order.
How Has This Been Tested?
Added tests coverage to the test/parser.js
and test/tables/gpos.js
including new binary data and testing:
- the Lookup Type 4: Mark-to-Base Attachment Positioning Subtable specification
- the LookupType 9: Extension Positioning
These lookups uses existing structures like coverage tables, and were tested agains different formats. Extension positioning lookup was tested agains all currently supported and not supported subtable parsers.
Added tests coverage to the test/layout.js
to test the method for union lookups (getFeaturesLookups
).
Added tests coverage for glyph positioning tests in the test/font.js
including:
- Kerning feature support with kerning table and lookup table support
-
options.kerning
andoptions.features.kern
testing - Mark To Base Attachment feature support to work with kern features
-
options.features.mark
testing Added tests coverage to thetest/bidi.js
to testmultiSubstitutionFormat1
glyphs decomposition
Moreover there were many test conducted for the rendered content using latn
and arab
fonts.
Screenshots (if appropriate):
Positioning issues
BEFORE (INVALID glyphs pos):
AFTER (FIXED):
Substitution issues
BEFORE (INVALID) - very end glyph is a single glyph that should be decomposed into two.
AFTER (FIXED) - decomposed into two glyph:
Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [x] I did
npm run test
and all tests passed green (including code styling checks). - [x] I have added tests to cover my changes.
- [x] My change requires a change to the documentation.
- [x] I have updated the README accordingly.
- [x] I have read the CONTRIBUTING document.
Does this make #491 redundant, or do we have to consolidate the two? Could you please have a look at each other's code @rafallyczkowskiadylic & @AlexandreRivet?
@rafallyczkowskiadylic I noticed that rlig is missing an implementation for latin scripts and liga is missing an implementation for arabic scripts, is that also fixed in this PR?
@Connum https://github.com/opentypejs/opentype.js/pull/491 is related and probably redundant. But initially I was reviewing it and it felt like not fully covered and bounding with kern feature only. This MR covers positioning feature in first place, and GPOS 9 was required to be supported along with the GPOS 4.
This MR fully covers what was proposed in https://github.com/opentypejs/opentype.js/pull/491/files
My proposition for modus operandi:
Let's focus on https://github.com/opentypejs/opentype.js/pull/535 in first place. I will commit and cover the missing rlig
over there. By the time I will be working on that please review this MR and drop questions or any concerns so we can discuss.
@ILOVEPIE These two MRs are considered as separate chunks of features. I will fix whatever will be needed starting with the smaller one https://github.com/opentypejs/opentype.js/pull/535
Also, @rafallyczkowskiadylic does this PR fix #501, because if you're redoing this code anyways, it might be worth fixing it.
@rafallyczkowskiadylic the feature order is quite a critical thing to add before we start working on implementing more features like calt
etc. It would be great if we could merge this soon!
@Connum once we have the other MR merged (that covers rlig
that is missing here as well) I will resolve and address all concerns around this mr shortly.
Sure, that's fine! We don't have to support everything at once when implementing a new feature or format. We should just make sure that users are notified about any missing/unsupported functionality.
I just wanted to avoid that anything was misusing that was supposed to be in this PR already.
@Connum Perfect! Great you are on top. So, let's address what is pending now. We stay in touch
I will check https://github.com/opentypejs/opentype.js/issues/501
#501 is fixed within this MR
@Connum @ILOVEPIE added few more commits with the substitution algorithm fixes and a new substitution format support. I haven't refactored the other working code (just the required bits). We can find few spots where we see "similar" functionalities. For example:
-
src/features/featureQuery.js
FeatureQuery.prototype.lookupFeature
collects substitutions - too much responsibility here. I marked this as deprecated and pointed the fixed algorithm -
src/substitution.js
Substitution.prototype.getFeature
- substitution processing -
src/bidi.js
- substitution applications
I believe we could move towards:
-
Layout
- to keep what it has, consume GPOS/GSUB definitions, lookups. Keep importantgetFeaturesLookups
method that build a union lookups list for ordered processing. -
Position
andSubstitution
- shares base functionality to provide features lookups through theLayout
-
FeatureQuery
- should be promoted to use this to find and process features (not only in the bidi) - substitution features per script i.e
src/features/latn/latinLigatures.js
src/features/arab/arabicRequiredLigatures.js
should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor ofapplySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);
Not sure if bidi
is really required, maybe for the directional substitutions but I think bidi could be deprecated and it's responsibility could be dropped into the Font
where bidi is being called and where positioning logic is happening anyways. But basically I think we could use a single processor both for substitutions and positioning.
Well this can open a discussion, but wanted to share some thoughts.
I'm totally for unifying duplicate functionality. The way that feature processing is currently structured is something that has deterred me from working on new font features because it frankly seems a bit messy and not very intuitive to me. However, I would see that as a separate step from this PR. Let's get this support in first and consolidate feature processing in the future. Otherwise there might be too much code to review at once, and we have enough open and partly almost-finished PRs lying around as it is.
That's my opinion and not the final say, of course.
edit: after a quick glance at the code changes, it looks to me like you have already done most of the work described in your previous comment? Of course let's keep the work you already put in there and just move any further consolidation to somewhere down the line.
This is blocking quite a few PRs, we should probably get this merged sometime soon.
Thank you guys for the comments. Yes, most of suggestions/work has been done. I have consolidated functionality as described introducing more generic functions to handle/fix specifications. I will resolve conflicts soon.
Great, looking forward to it! Thanks for your work!
Fixes #501 according to https://github.com/opentypejs/opentype.js/issues/501#issuecomment-1543727128
@rafallyczkowskiadylic it's been a couple of weeks - just a friendly nudge. :)
I was finally able to test this more thoroughly... I merged it into my experimental fork after resolving some conflicts.
The chars in the GPOS–3 test of the unicode test suite still rendered wrong, and they also didn't work in our demo page.
So I had to make a few changes to make it work, see https://github.com/Connum/opentype.js-next/commit/95688c6f54ec55274cfe2eee86eb8493788ca632
Hey presto!
However, these calculations are quite expensive, even before my additions, because they are done for each single glyph. I don't think that's necessary and we should implement ways to at least cache this, or best only call the feature update when something has changed.
Ok, now the tests for disabling the mark feature are failing of course, because I implemented it so that default features cannot be disabled. We'll have to talk about how we want to handle this.
The more I look at this, the more confused I get... 😅
I think we still need some cleaning up before we can finally merge this, but I'm lost here without you @rafallyczkowskiadylic. I tried to fix https://rawgit.com/unicode-org/text-rendering-tests/master/reports/OpenType.js.html#GSUB-1, but I got caught up in spaghetti code for several hours trying to follow the path of the substitution mechanism and got confused between old and new functionality. e.g. we should really get rid of this:
- substitution features per script i.e
src/features/latn/latinLigatures.js
src/features/arab/arabicRequiredLigatures.js
should not be required since mostly they are simply the same. We basically need word ranges and context, but these are the same methods. These can be removed in favor ofapplySubstitutions.call(this, 'thai', ['liga', 'rlig', 'ccmp']);
Regarding this substitution case, I started wondering why we split into words in the first place? The space glyph being part of a substitution is perfectly fine, there's no rule against it to my knowledge.
I know that's the crux of open source, but I wouldn't be such a pest if it wasn't blocking the progress so much. A clear structure for substitution and positioning as well as supporting all the different formats is really crucial for implementing further features.
By the way, trying to get rid of the latn-specific applyLatinLigatures
, I replaced it with
function applyLatinFeatures() {
checkGlyphIndexStatus.call(this);
applySubstitutions.call(this, 'latn', ['liga', 'rlig', 'ccmp', 'calt']);
}
in analogy to applyThaiFeatures()
, and called it in Bidi.prototype.applyFeaturesToContexts
:
if (this.checkContextReady('latinWord')) {
applyLatinFeatures.call(this);
}
But then all ligatures stopped working...
Hey @Connum I am on it. Let me resolve conflicts and see what is on the code. I will revert back by the end of the weekend
@Connum Conflicts resolved.
Pending items:
- Could you share this
TestShapeEthi.ttf
file with me so I can verify the outcomes? - I understand. I had similar issue, with some redundancy to get into the logic, that is why I didn't wanted to deprecate old code but rather leverage a new core lookup features. The most important functions is: Layout.getFeaturesLookups that implements common logic for fetching features, for substitutions and for positioning.
I started wondering why we split into words in the first place?
It was implemented like that, and the reason for this is that within single sentence there may be characters from a different language context. Example: 我的iPhone是最好的
where "iPhone" is for the latin context word.
- For the
applyLatinLigatures
issue, could you send me specific case you are testing? Including exact string, expected result, and source font file. I will check what may be happening because I see no issues with the latin fonts I am testing currently.
Moreover, please keep in mind supported substitution formats.
/**
* Supported substitutions
*/
const SUBSTITUTIONS = {
11: singleSubstitutionFormat1,
12: singleSubstitutionFormat2,
63: chainingSubstitutionFormat3,
41: ligatureSubstitutionFormat1,
21: multiSubstitutionFormat1
};
Different fonts' source files may have these definitions in a different formats that eventually may not be supported yet.
@Connum if you could share all your use cases, with results you expect to see and source font files - I can review all of them and eventually address.
For the clean up I think we could just mark functions as deprecated for now. I'd refactor only bits covered with reliable tests so we can actually rely on outcomes before/after we change the code (this was also the reason I'd rather applied the code and "promoted" than removing or refactoring old functioning pieces).
Is there easy way to tests svg using demo page? I wrapped up simple compiler so I am using this a
npm run font:preview -- --font="Roboto-Black.ttf" --content='Lorem ipsum dolor sit amet. ąłćżółźżćśćłóęą '
that generates preview.html
that compares HTML (browser) rendered content and SVG path.
Thanks for your work!
You can check out the unicode test suite: https://github.com/unicode-org/text-rendering-tests
It does exactly what you proposed, compare SVG output with expected results. You can replace the opentype.js files in the node_modules folder with the files compiled from your dev branch (also the files in /bin/) to see if everything related to GSUB passes.
The repo also provides the test files. If you're able to fix this with this PR, you should copy the file to our tests folder (it's licensed under OFL) and add our own test. But it's OK if we just do the refactoring and mark to base first if it holds us up too much. But I'll definitely need help with that space thing as well. :)
Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR
Ok. @Connum currently I do not have pending items, unless you share your failing test cases (font names, content tested, etc) with me according to this so I can review and address within this MR
See my reply! 😉 But it's OK if we stick to GPOS mark-to-base with this PR and tackle the rest later. I'll review the code tomorrow.
I don't really like the idea of having both approaches in the code base in parallel. The arab and latn features should work with the new approach as well, so we should move everything to work with it. I understand if this is more than you're willing to do. Could you mark the PR as allowing maintainers to make changes? Then me or someone else could look into fixing this for all the scripts.
Shouldn't I be able to change applyLatinLigatures()
to
function applyLatinLigatures() {
applySubstitutions.call(this, 'latn', ['liga']);
}
and it should just work? But the ligature tests then fail.
Shouldn't I be able to change
applyLatinLigatures()
tofunction applyLatinLigatures() { applySubstitutions.call(this, 'latn', ['liga']); }
and it should just work? But the ligature tests then fail.
Actually I checked and found few tests failing for your cases, for latn
they are related to manipulating font's definitions (Substitution.prototype.addLigature
). These implementations are still using old lookupTables search, and manipulating local references (
In other word, addLigature is using Layout.getLookupTables
, manipulates this method's output references (expecting the test will also use .getLookupTables
method). Thing is my solution does not use .getLookupTables
at all because of getFeaturesLookups
Substitution.prototype.addLigature = function(feature, ligature, script, language) {
const lookupTable = this.getLookupTables(script, language, feature, 4, true)[0];
let subtable = lookupTable.subtables[0];
...
subtable.ligatureSets = .... // <- manipulation
});
yet Layout.getLookupTables
creates own copy of subtables refs (that is why we can't see them within new feature)
If .addLigrature
would instead manipulate definition on the this.font.tables[this.tableName].lookups
level (not local references) everything will work as expected. This is how I would approach this, to deprecate any use of getLookupTables
and if there are definitions updated I would try to make them as "high" as possible (ideally around specific table def source, this.font.tables[this.tableName]....
)