ufo-spec
Better anchor definition
For ligature anchors, many UFOs designed with one authoring tool don’t work with other authoring tools, as they have different ways of storing this information. Some authoring tools expect specific suffixes (like `_1`, `_2`, ... or `#1`, `#2`, ...) while others expect specific prefixes. It would be better to standardize this, either in the name or preferably with an attribute (for example `ligatureIndex`).
/cc @graphicore @khaledhosny @jamesgk
Related questions:
- How do we know, looking at a UFO source, which glyph is of type Ligature, Mark, Base, other ... ?
- How many parts is a Ligature-Glyph made of? We need that to write correct mark2liga features, also for ligature carets (also needs position information). IMHO, there's nothing in UFO that helps.
@graphicore I think another issue can be opened for a way to define ligature caret. The point of this issue is ligature anchors.
There are also cursive anchors, but in general, how do you tell what type a given anchor is: base, mark, ligature, entry or exit? I don’t see how one would be able to make use of the anchors without knowing such essential information about them.
@moyogo the point is that you need the type of the glyph -- ligature -- and how many components it has, for both.
@khaledhosny right. I think Glyphs uses a naming scheme where `name` is the base anchor and `_name` is the matching mark anchor. For ligature anchors it would use `_name_1`, `_name_2` etc., afaik; that may not be fully correct, though. For entry and exit it probably just uses reserved names like "entry" and "exit". I'm not saying it's the best solution, but naming conventions could be a solution.
I've done a lot of work with using glyph names to indicate feature states and behavior. Pretty much every fancy OpenType font that I've written code for has had a specific naming scheme and a script that interprets it into .fea. The names are ultimately not very flexible and they are really cumbersome for designers. It's not fun to type /o.010340/o.120140 when you just want to compare a couple of "o" designs.
Anyway, I think a much more future proof solution would be to publicly define a structure that can be stored in the lib that defines GSUB/GPOS behavior for a particular glyph. That would open up a huge number of interesting possibilities. But... How far does it go? Do we define what the generated .fea should look like? Or, do we stop at saying, "this is a ligature and here's some data about it" and "this is a mark"? This is going to get really deep, really fast. Even more problematic, how is a tool supposed to know when something that exists in .fea should be replaced with auto-generated .fea code and when it should be left alone because the designer made an edit?
Don't get me wrong, I think some sort of data structure that can deeply describe the intended use of a glyph would be really useful. Defining how that is used is something that I'm not sure we can get consensus on.
- Would Anchor become more usable if it had an optional `type` attribute that contains a (standardized) string?
- Would defining `glyph.lib["public.*"]` keys for glyph type and ligature component count be helpful?
That would be very useful. As I mentioned in a fontParts issue, having an anchor lib would also be nice, to add other information such as the context of mark positioning. This context information can be stored in the glyph lib now, but then any authoring tool that removes the anchor must also remove that information from the glyph lib. If there is going to be an anchor type property, I would prefer to store the information there instead of in the glyph lib. I guess I could store a more elaborate data structure there as XML if it's a string, but I'm not sure whether that would be considered standardized.
As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.
Regarding `anchor.type`: what are the needed values for such a field? In the above comments I read:
- base
- mark
- ligature
- entry
- exit
Is that correct and complete?
Likewise for the type of a glyph?
> As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.
I will probably use this type property then.
> Is that correct and complete?
I think so. But I don't know what the use of a ligature type would be, though. In Arabic, for example, an anchor is either base or mark; on ligatures, the base anchor gets an index for the order.
> Likewise for the type of a glyph?
Only from this set:
- base
- ligature
- mark
- component
https://docs.microsoft.com/en-us/typography/opentype/spec/gdef#glyph-class-definition-table-overview
(Just to be clear for anchor type) Anchor type can only be:
- base
- mark
- entry
- exit
Base anchor can get an index attribute for the logical order of mark in a ligature.
Actually, in an Adobe .fea file, mark positioning on a ligature and on a base are defined differently.
- Base:
pos base [behDotless-ar] <anchor 428 -5> mark @mark_bottom <anchor 438 368> mark @mark_top;
- Ligature:
pos ligature [lam_alefWasla-ar.fina] <anchor 473 -3> mark @mark_bottom <anchor 492 726> mark @mark_top ligComponent <anchor 139 -3> mark @mark_bottom <anchor 173 893> mark @mark_top;
But personally I don't need to know if an anchor is ligature type when I generate the feature. I just check if it's a base and if it has an index.
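A minimal sketch of that check, assuming each base anchor is available as a `(mark class, x, y, index)` tuple where `index` is `None` for non-ligature anchors. The tuple layout and the function names are hypothetical and not taken from any existing tool; the emitted statements follow the shape of the two examples above.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

# Hypothetical layout: (mark class name, x, y, ligature component index or None)
BaseAnchor = Tuple[str, int, int, Optional[int]]


def _attachments(anchors: List[BaseAnchor]) -> str:
    """Format anchors as '<anchor x y> mark @class' pairs, in the given order."""
    return " ".join(f"<anchor {x} {y}> mark @{cls}" for cls, x, y, _ in anchors)


def mark_pos_statement(glyph_name: str, anchors: List[BaseAnchor]) -> str:
    """Emit 'pos base' when no anchor has an index, otherwise 'pos ligature'."""
    indexed = [a for a in anchors if a[3] is not None]
    if not indexed:
        return f"pos base [{glyph_name}] {_attachments(anchors)};"
    # Group anchors by component index; unindexed anchors are ignored here,
    # and components that have no anchors at all are not handled.
    by_component = defaultdict(list)
    for anchor in indexed:
        by_component[anchor[3]].append(anchor)
    parts = [_attachments(by_component[i]) for i in sorted(by_component)]
    return f"pos ligature [{glyph_name}] " + " ligComponent ".join(parts) + ";"
```

With the anchor values from the two examples above, this reproduces both statements, so the ligature type never has to be stored separately from the index.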
The current solution, which avoids adding any property to anchors, is a naming convention for anchors. The advantage of a naming convention is that the intention of the anchor is visible. The name corresponds to the mark anchor that is going to be placed on the base. So if a `top` mark is going to be placed on a base, the base anchor in the base glyph is called `top` and the mark anchor in the mark glyph is called `_top`. This is the naming convention from the Glyphs app and I think it's also used in fontmake. For a ligature, the anchor index is written after the name, so it becomes `top_1`. For entry and exit, a prefix is needed to differentiate them from mark anchors. The prefix for a mark anchor is `_` and the prefix for a cursive anchor is `#`, so it becomes `#entry` or `#exit`.
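To make the convention concrete, here is a rough sketch of how a tool might interpret anchor names under exactly the rules described in this comment. `ParsedAnchor` and `parse_anchor_name` are hypothetical names, and this is not the actual Glyphs or fontmake implementation, whose behavior may differ.

```python
import re
from typing import NamedTuple, Optional


class ParsedAnchor(NamedTuple):
    kind: str             # "base", "mark", "entry" or "exit"
    name: str             # attachment name, e.g. "top"
    index: Optional[int]  # ligature component index, or None


# A trailing "_<digits>" is read as a ligature component index, e.g. "top_1".
_INDEX_RE = re.compile(r"^(?P<name>.+)_(?P<index>[0-9]+)$")


def parse_anchor_name(anchor_name: str) -> ParsedAnchor:
    """Interpret an anchor name under the convention described above."""
    if anchor_name in ("#entry", "#exit"):
        kind = anchor_name[1:]
        return ParsedAnchor(kind, kind, None)
    if anchor_name.startswith("_"):
        # Leading underscore: the mark side of an attachment.
        return ParsedAnchor("mark", anchor_name[1:], None)
    match = _INDEX_RE.match(anchor_name)
    if match:
        return ParsedAnchor("base", match.group("name"), int(match.group("index")))
    return ParsedAnchor("base", anchor_name, None)
```

Everything the compiler learns about an anchor here comes from string parsing, which is the guesswork that explicit attributes would remove.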
See also: https://github.com/googlefonts/ufo2ft/issues/330#issuecomment-489126757
Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:
- What are the weaknesses of this scheme?
- How could the procedure (of generating such features) benefit from UFO enhancements?
In other words:
- What is the problem?
- How does the current solution work?
- How is the current solution not adequate?
- Can the UFO format be enhanced to solve said inadequacies? If so, how?
> Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:
I will do that, thank you for pointing them out.
One note for now. If the glyph type is mark, it could include both mark and base anchors. In a glyph that is defined as a mark, the base anchor is used for mark to mark positioning. Takeaway: anchor type cannot always be inferred from its glyph type.
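As a sketch of that takeaway, the mapping below derives the positioning role from the pair (glyph type, anchor type) rather than from either one alone. The string values for glyph and anchor types are the ones floated in this thread, the role labels are informal names chosen for illustration, and `positioning_role` is hypothetical.

```python
from typing import Optional


def positioning_role(glyph_class: str, anchor_type: str) -> Optional[str]:
    """Return an informal label for what an anchor contributes to GPOS.

    Both arguments are needed: a base anchor means something different
    in a base glyph, a ligature glyph and a mark glyph.
    """
    if anchor_type in ("entry", "exit"):
        return "cursive"
    if anchor_type == "mark":
        # Assumption: mark anchors are only meaningful inside mark glyphs.
        return "mark-class" if glyph_class == "mark" else None
    if anchor_type == "base":
        return {
            "base": "mark-to-base",
            "ligature": "mark-to-ligature",
            "mark": "mark-to-mark",
        }.get(glyph_class)
    return None
```

A mark glyph carrying both a mark anchor and a base anchor therefore yields two roles, one for each anchor.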
One limitation of the above scheme is that it allows only one cursive anchor/lookup in the entire font.
Could you give an example where there is a need for multiple lookups for cursive anchor?
OpenType allows it; I don’t see why we should have such a limitation otherwise. I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.
> I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.
Good point. Thank you!
This also raises the question: do we need an `RTL` property for anchors?
Is this https://github.com/googlefonts/ufo2ft/issues/303 ?
@khaledhosny, are there any other fundamental problems that are good to consider up front?
No, another font that I didn’t convert to UFO since fontmake does not support cursive anchors at all.
My main problem with UFO anchors right now is that their behavior is unspecified. Tools like fontmake and ufo2ft try to follow Glyphs, but 1) Glyphs is not a UFO editor, 2) its behavior is not specified either, and 3) it has arbitrary limitations (like supporting only one cursive anchor per font).
I prefer being explicit over implicit, so I think `type`, `index` and `flags` attributes would (potentially) allow the compiler to build any kind of mark positioning lookup supported by OpenType without having to do lots of guesswork. The new attributes should be optional so that people who prefer the current behavior can continue doing so without disruption.
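A minimal sketch of what such an anchor could look like in memory, assuming the three attributes are optional and default to "unset". The class, the allowed type strings and the representation of `flags` as an integer are assumptions made for illustration; none of this is part of the UFO spec. The only outside fact used is that the OpenType lookup flag for right-to-left cursive attachment is `0x0001`.

```python
from dataclasses import dataclass
from typing import Optional

# Anchor types as listed earlier in this thread (hypothetical vocabulary).
ANCHOR_TYPES = {"base", "mark", "entry", "exit"}

# OpenType lookupFlag bit for right-to-left cursive attachment.
RIGHT_TO_LEFT = 0x0001


@dataclass
class Anchor:
    name: str
    x: float
    y: float
    type: Optional[str] = None   # unset: fall back to name-based behavior
    index: Optional[int] = None  # ligature component order, base anchors only
    flags: Optional[int] = None  # assumed to hold OpenType lookup flag bits

    def __post_init__(self) -> None:
        if self.type is not None and self.type not in ANCHOR_TYPES:
            raise ValueError(f"unknown anchor type: {self.type!r}")
        if self.index is not None and self.type != "base":
            raise ValueError("index is only meaningful on base anchors")

    def is_explicit(self) -> bool:
        """True when the compiler does not need to guess from the name."""
        return self.type is not None
```

An anchor created with only a name and coordinates stays valid, so existing sources that rely on naming conventions would keep working unchanged.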
One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.
> One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.
This is the main limitation of anchors for me in UFO. A context cannot be written in the name. I would love to have the option to write it in the anchor, but there is none at the moment.
A possible solution for the future: adding an attributes dictionary to the anchor with (standardized) string keys?
A solution for now: write all the related info for mark and cursive positioning inside the glyph lib and define the spec. An authoring tool (e.g. an RF extension) is then required to show the preview, change the attributes and compile them to features.
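A rough sketch of the glyph lib approach, including the cleanup step a tool would need once an anchor is deleted. The lib key, the entry layout and the `context` placeholder are all hypothetical and would have to be defined by whatever spec is agreed on; only `plistlib` from the Python standard library is real.

```python
import plistlib

# Hypothetical lib key and entry layout; nothing here is defined by the UFO spec.
LIB_KEY = "com.example.anchorPositioning"


def prune_positioning_data(glyph_lib, anchor_names):
    """Return a copy of glyph_lib without entries for anchors that no longer exist."""
    kept = [e for e in glyph_lib.get(LIB_KEY, []) if e["anchor"] in anchor_names]
    pruned = dict(glyph_lib)
    if kept:
        pruned[LIB_KEY] = kept
    else:
        pruned.pop(LIB_KEY, None)
    return pruned


glyph_lib = {
    LIB_KEY: [
        {"anchor": "top", "type": "base", "index": 1},
        # "context" is a placeholder; its format is exactly what is undefined.
        {"anchor": "bottom", "type": "base", "index": 1, "context": "placeholder"},
    ]
}

# The structure uses only plist-compatible types, so it survives serialization.
assert plistlib.loads(plistlib.dumps(glyph_lib)) == glyph_lib

# After the designer deletes the "bottom" anchor, its lib entry is dropped too.
pruned = prune_positioning_data(glyph_lib, {"top"})
```

The pruning step is the cost of keeping this data outside the anchor itself: every tool that edits anchors has to know about the key.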
Another disadvantage of the anchor naming scheme is that the intention of the designer is unclear. The anchor naming scheme for generating mark positioning is exactly the same as the one used to build precomposed accented glyphs. Is the anchor made to generate mark positioning or just to build precomposed accented glyphs? This could lead to unnecessary extra data in the final binary. An anchor that is used for glyph construction is not necessarily supposed to be used for mark positioning, and the position of a mark in a precomposed glyph can differ from that of a mark placed on the actual base glyph through mark positioning (very common in Arabic). One solution: a new data structure used only for generating mark and cursive attachment, which could be saved in the glyph lib and would keep these data separate.
> Is the anchor made to generate mark positioning or just to build precomposed accented glyphs?
That's an interesting question.
On the one hand: well, maybe it is cool to have mark features for (latin) accents? On the other: that seems rather accidental, and is not controllable enough.
But `anchor.type` could be used to distinguish them, if it existed, so this is a nice argument in favor of that.
> On the one hand: well, maybe it is cool to have mark features for (Latin) accents?
I'm also wondering. In Arabic, mark positioning is needed on any type of letter (whether precomposed or not) but in Latin:
- How often are letter+accent(s) combinations typed?
- Which layout engines override these combinations and replace them with a precomposed glyph that already exists in the font?
Glyphs app uses the anchor naming scheme to generate mark positioning and also to build composites. There are situations where a composite glyph doesn't have any anchors, but Glyphs app borrows mark positioning from the baseGlyph(s) and duplicates it in the composite during compile. The user can't even see these virtual mark positionings in the glyph view. The generated mark feature for Latin could become as huge as, or even bigger than, that for Arabic. I wonder if this huge amount of data could cause other issues? I would appreciate @behdad's and/or @anthrotype's insight into these questions.
In my experience, mark positioning is not equal to accented composites.
Since I don't know much about Devanagari I would also appreciate @tiroj insights. That would help to address any current issues with anchor definition or mark positioning in UFO.