ufo-spec Better anchor definition

For ligature anchors, many UFOs designed with one authoring tool don’t work with other authoring tools as they have different ways of storing this information. Some authoring tools expect specific suffixes (like _1, _2,... or #1, #2,...) while others expect specific prefixes. It would be better to standardize this, either in the name or preferably with an attribute (for example ligatureIndex).

/cc @graphicore @khaledhosny @jamesgk

Apr 23 '16 20:04 moyogo

Related questions:

How do we know, looking at a UFO source, which glyph is of type Ligature, Mark, Base, other ... ?
How many parts is a Ligature-Glyph made of? We need that to write correct mark2liga features, also for ligature carets (also needs position information). IMHO, there's nothing in UFO that helps.

Apr 23 '16 20:04 graphicore

@graphicore I think another issue can be opened for a way to define ligature caret. The point of this issue is ligature anchors.

Apr 23 '16 21:04 moyogo

There are also cursive anchors, but in general, how do you tell what type a given anchor is: base, mark, ligature, entry, or exit? I don’t see how one would be able to use the anchors without knowing such essential information about them.

Apr 23 '16 22:04 khaledhosny

@moyogo the point is that you need the type of the glyph -- ligature -- and how many components it has, for both.

@khaledhosny right. I think Glyphs uses a naming scheme where name is the base anchor and _name is the matching mark anchor. For ligature anchors it would use _name_1, _name_2 etc., afaik, though that may not be fully correct. For entry and exit it probably just uses reserved names like "entry" and "exit". I'm not saying it's the best solution, but naming conventions could be a solution.

Apr 23 '16 23:04 graphicore

I've done a lot of work with using glyph names to indicate feature states and behavior. Pretty much every fancy OpenType font that I've written code for has had a specific naming scheme and a script that interprets it into .fea. The names are ultimately not very flexible and they are really cumbersome for designers. It's not fun to type /o.010340/o.120140 when you just want to compare a couple of "o" designs.

Anyway, I think a much more future proof solution would be to publicly define a structure that can be stored in the lib that defines GSUB/GPOS behavior for a particular glyph. That would open up a huge number of interesting possibilities. But... How far does it go? Do we define what the generated .fea should look like? Or, do we stop at saying, "this is a ligature and here's some data about it" and "this is a mark"? This is going to get really deep, really fast. Even more problematic, how is a tool supposed to know when something that exists in .fea should be replaced with auto-generated .fea code and when it should be left alone because the designer made an edit?

Don't get me wrong, I think some sort of data structure that can deeply describe the intended use of a glyph would be really useful. Defining how that is used is something that I'm not sure we can get consensus on.

Apr 24 '16 02:04 typesupply

Would Anchor become more usable if it had an optional type attribute, that contains a (standardized) string?
Would defining glyph.lib["public.*"] keys for glyph type and ligature component count be helpful?
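
As a rough illustration of the second question, here is a minimal sketch of how a tool might read such keys from a glyph lib. The key names public.glyphType and public.ligatureComponentCount are hypothetical placeholders, not part of the UFO spec, and the glyph lib is modeled as a plain dict.

```python
# Hypothetical key names, used only for illustration; the UFO spec
# does not define them.
GLYPH_TYPE_KEY = "public.glyphType"
LIGATURE_COUNT_KEY = "public.ligatureComponentCount"


def describe_glyph(lib):
    """Return (glyph_type, component_count) read from a glyph lib dict.

    Both values are None when the keys are absent, so existing UFOs
    without the keys keep working.
    """
    glyph_type = lib.get(GLYPH_TYPE_KEY)
    count = lib.get(LIGATURE_COUNT_KEY)
    if glyph_type == "ligature" and count is None:
        raise ValueError("a ligature glyph needs a component count")
    return glyph_type, count


lib = {GLYPH_TYPE_KEY: "ligature", LIGATURE_COUNT_KEY: 2}
glyph_type, count = describe_glyph(lib)
```

Keeping both keys optional would let a compiler fall back to its current guesswork when they are missing.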

May 04 '19 06:05 justvanrossum

That would be very useful. As I mentioned in a fontParts issue, having an anchor lib would also be nice for adding other information, like the context of mark positioning. This context information can be stored in the glyph lib now, but then, when an anchor is removed, any authoring tool must also remove that information from the glyph lib. If there will be an anchor type property, I would prefer to store the information there instead of in the glyph lib. I guess I could store a more elaborate data structure there using XML if it's a string, but I'm not sure whether that would be considered standardized.

May 06 '19 08:05 typoman

As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.

Regarding anchor.type: what are the needed values for such a field? In the above comments I read:

base
mark
ligature
entry
exit

Is that correct and complete?

Likewise for the type of a glyph?

May 06 '19 09:05 justvanrossum

> As explained in the issue you reference, adding lib to anchor is problematic, so please focus on solutions that don't require that.

I will probably use this type property then.

> Is that correct and complete?

I think so. But I don't know what the use of the ligature type is, though. In Arabic, for example, an anchor is either base or mark. On ligatures, the base anchor gets an index for the order.

May 06 '19 09:05 typoman

> Likewise for the type of a glyph?

Only from this set:

base
ligature
mark
component

https://docs.microsoft.com/en-us/typography/opentype/spec/gdef#glyph-class-definition-table-overview

May 06 '19 09:05 typoman

(Just to be clear for anchor type) Anchor type can only be:

base
mark
entry
exit

Base anchor can get an index attribute for the logical order of mark in a ligature.

May 06 '19 10:05 typoman

Actually, in an Adobe .fea file, mark-to-ligature and mark-to-base attachments are defined differently.

Base: pos base [behDotless-ar] <anchor 428 -5> mark @mark_bottom <anchor 438 368> mark @mark_top;
Ligature: pos ligature [lam_alefWasla-ar.fina] <anchor 473 -3> mark @mark_bottom <anchor 492 726> mark @mark_top ligComponent <anchor 139 -3> mark @mark_bottom <anchor 173 893> mark @mark_top;

But personally I don't need to know if an anchor is ligature type when I generate the feature. I just check if it's a base and if it has an index.
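
A minimal sketch of that check, assuming anchors are plain dicts with hypothetical keys name, x, y, type and an optional 1-based index (none of these attributes exist in the UFO spec today). The mark class naming @mark_<name> follows the two rules quoted above.

```python
def mark_pos_rule(glyph_name, anchors):
    """Build a 'pos base' or 'pos ligature' rule from base-type anchors.

    Returns None when the glyph has no base anchors. Anchors of any
    other type are ignored here.
    """
    bases = [a for a in anchors if a["type"] == "base"]
    if not bases:
        return None

    def attach(items):
        # One "<anchor x y> mark @class" pair per anchor, in source order.
        return " ".join(
            f"<anchor {a['x']} {a['y']}> mark @mark_{a['name']}" for a in items
        )

    indexed = [a for a in bases if a.get("index") is not None]
    if not indexed:
        # No index anywhere: treat the glyph as a plain base.
        return f"pos base [{glyph_name}] {attach(bases)};"

    # Group anchors by ligature component, in logical order.
    components = sorted({a["index"] for a in indexed})
    parts = [attach([a for a in indexed if a["index"] == i]) for i in components]
    return f"pos ligature [{glyph_name}] " + " ligComponent ".join(parts) + ";"


base_rule = mark_pos_rule(
    "behDotless-ar",
    [
        {"name": "bottom", "x": 428, "y": -5, "type": "base"},
        {"name": "top", "x": 438, "y": 368, "type": "base"},
    ],
)
```

This sketch does not handle a ligature component that has no anchors at all; a real compiler would have to emit a null anchor for it, which in turn requires knowing the component count of the glyph.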

May 06 '19 10:05 typoman

The current solution that avoids adding any property to the anchor is to use naming conventions for anchors. The advantage of a naming convention is that the intention of the anchor is visible. The name corresponds to the mark anchor that is going to be placed on the base. So if a top mark is going to be placed on a base, the base anchor in the base glyph is called top and the mark anchor in the mark glyph is called _top. This is the naming convention from the Glyphs app and I think it's also used in fontmake. For a ligature, the anchor index is written after the name, so it becomes top_1. For entry and exit, a prefix is needed to differentiate them from mark anchors. The prefix for a mark anchor is _ and the prefix for a cursive anchor is #, so they become #entry or #exit.
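
To make the convention concrete, here is a small sketch of a parser for it. It only covers the rules as described in this comment (plain name, _ prefix, # prefix, _N suffix); it is not taken from Glyphs or fontmake, and the function name is made up for illustration.

```python
import re


def parse_anchor_name(name):
    """Split an anchor name into (kind, key, index).

    kind is one of "base", "mark", "ligature" or "cursive";
    index is only set for ligature anchors.
    """
    if name.startswith("#"):
        # Cursive anchors: "#entry" or "#exit".
        return ("cursive", name[1:], None)
    if name.startswith("_"):
        # Mark anchors: "_top" attaches to a base anchor named "top".
        return ("mark", name[1:], None)
    match = re.fullmatch(r"(.+)_(\d+)", name)
    if match:
        # Ligature anchors: "top_1" is "top" on the first component.
        return ("ligature", match.group(1), int(match.group(2)))
    return ("base", name, None)


kinds = [parse_anchor_name(n) for n in ("top", "_top", "top_1", "#entry")]
```

Everything the compiler learns here comes from the name alone, which is why information that does not fit in a name cannot be expressed with this scheme.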

May 06 '19 12:05 typoman

See also: https://github.com/googlefonts/ufo2ft/issues/330#issuecomment-489126757

Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:

What are the weaknesses of this scheme?
How could the procedure (of generating such features) benefit from UFO enhancements?

In other words:

What is the problem?
How does the current solution work?
How is the current solution not adequate?
Can the UFO format be enhanced to solve said inadequacies? If so, how?

May 06 '19 12:05 justvanrossum

> Suggestion: Describe/document the Glyphs/fontmake behavior in more detail. It may then become easier to reason about the following:

I will do that, thank you for pointing them out.

One note for now. If the glyph type is mark, it could include both mark and base anchors. In a glyph that is defined as a mark, the base anchor is used for mark to mark positioning. Takeaway: anchor type cannot always be inferred from its glyph type.
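
A sketch of that takeaway, assuming the glyph types and anchor types listed earlier in this thread. The function and the returned labels are made up for illustration.

```python
def lookup_kind(glyph_type, anchor_type):
    """Decide which kind of GPOS rule an anchor contributes to.

    The glyph type alone is not enough: a mark glyph can carry a base
    anchor (for mark-to-mark) next to its own mark anchor.
    """
    if anchor_type in ("entry", "exit"):
        return "cursive"
    if anchor_type == "mark":
        # The attachment point of the mark itself; it feeds a mark class.
        return "mark-class"
    if anchor_type == "base":
        if glyph_type == "mark":
            return "mark-to-mark"
        if glyph_type == "ligature":
            return "mark-to-ligature"
        return "mark-to-base"
    raise ValueError(f"unknown anchor type: {anchor_type!r}")


# A mark glyph with both kinds of anchors.
kinds = [lookup_kind("mark", t) for t in ("mark", "base")]
```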

May 06 '19 13:05 typoman

One limitation of the above scheme is that it allows only one cursive anchor/lookup in the entire font.

May 06 '19 13:05 khaledhosny

Could you give an example where there is a need for multiple lookups for cursive anchor?

May 06 '19 13:05 typoman

OpenType allows it, so I don’t see why we should have such a limitation otherwise. I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.

May 06 '19 13:05 khaledhosny

> I have a font that has cursive anchors with the RTL flag and without it, based on whether it wants the rightmost or the leftmost glyph to be the one sitting on the baseline.

Good point. Thank you!

May 06 '19 13:05 typoman

This also raises the question: do we need an RTL property for anchors?

May 06 '19 13:05 typoman

Is this https://github.com/googlefonts/ufo2ft/issues/303 ?

@khaledhosny, are there any other fundamental problems that are good to consider up front?

May 06 '19 13:05 justvanrossum

No, another font that I didn’t convert to UFO since fontmake does not support cursive anchors at all.

My main problem with UFO anchors right now is that their behavior is unspecified. Tools like fontmake and ufo2ft try to follow Glyphs, but 1) Glyphs is not a UFO editor, 2) its behavior is not specified either, and 3) it has arbitrary limitations (like supporting only one cursive anchor per font).

I prefer explicit over implicit, so I think type, index and flags attributes would (potentially) allow the compiler to build any kind of mark positioning lookup supported by OpenType without having to do lots of guesswork. The new attributes should be optional so that people who prefer the current behavior can continue doing so without disruption.
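
As a rough sketch of what that could look like on the compiler side: the Anchor class below is hypothetical, with optional type, index and flags fields standing in for the proposed attributes. Grouping cursive anchors by their flags gives one lookup per distinct flag set, which would lift the one-cursive-lookup limitation mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Anchor:
    # name, x and y exist in UFO today; the rest is the proposal.
    name: str
    x: int
    y: int
    type: Optional[str] = None   # "base", "mark", "entry" or "exit"
    index: Optional[int] = None  # ligature component, 1-based
    flags: Tuple[str, ...] = ()  # e.g. ("RightToLeft",)


def group_cursive(glyphs):
    """Map each distinct flag set to {glyph_name: {"entry"/"exit": Anchor}}."""
    lookups = defaultdict(dict)
    for glyph_name, anchors in glyphs.items():
        for anchor in anchors:
            if anchor.type in ("entry", "exit"):
                per_glyph = lookups[anchor.flags].setdefault(glyph_name, {})
                per_glyph[anchor.type] = anchor
    return dict(lookups)


glyphs = {
    "beh-ar.init": [Anchor("exit", 0, 0, type="exit", flags=("RightToLeft",))],
    "beh-ar.fina": [Anchor("entry", 500, 0, type="entry", flags=("RightToLeft",))],
    "s.swash": [Anchor("exit", 600, 20, type="exit")],
}
lookups = group_cursive(glyphs)
```

Anchors without a type are simply skipped here, which matches the idea that the new attributes stay optional.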

One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.

May 06 '19 14:05 khaledhosny

> One thing that would still be unsolved is contextual mark positioning. I have no idea how that would be supported for anchors without full OpenType machinery in the format. So I guess people will have to keep writing that manually or have font-specific scripts to handle them.

This is the main limitation of anchors in UFO for me. A context cannot be written in the name. I would love to have the option to write it in the anchor, but there is none atm.

May 06 '19 14:05 typoman

A possible solution for the future: adding an attributes dictionary to the anchor with (standardized) string keys?

May 06 '19 16:05 typoman

A solution for now: write all the related info for mark and cursive positioning inside the glyph lib and define the spec. An authoring tool (e.g. an RF extension) is then required to show the preview, change its attributes and compile it to features.

May 06 '19 17:05 typoman

Another disadvantage of the anchor naming scheme is that the intention of the designer is unclear. The anchor naming scheme for generating mark positioning is exactly the same as the one used to build precomposed accented glyphs. Is the anchor made to generate mark positioning or just to build precomposed accented glyphs? This could lead to unnecessary extra data in the final binary. An anchor that is used for glyph construction is not necessarily supposed to be used for mark positioning, and the position of a precomposed mark can be different from that of a mark that is going to be placed on the actual base glyph using mark positioning (very common in Arabic). One solution: a new data structure used only for generating mark and cursive attachment, which could be saved in the glyph lib and would also keep these two kinds of data separate.

May 06 '19 19:05 typoman

> Is the anchor made to generate mark positioning or just to build precomposed accented glyphs?

That's an interesting question.

On the one hand: well, maybe it is cool to have mark features for (latin) accents? On the other: that seems rather accidental, and is not controllable enough.

But anchor.type could be used to distinguish them, if it existed, so this is a nice argument in favor of that.

May 07 '19 05:05 justvanrossum

> On the one hand: well, maybe it is cool to have mark features for (Latin) accents?

I'm also wondering. In Arabic, mark positioning is needed on any type of letter (whether precomposed or not) but in Latin:

How often are letter+accent(s) combinations typed?
Which layout engines override these combinations and replace them with a precomposed glyph that already exists in the font?

The Glyphs app uses the anchor naming scheme to generate mark positioning and also to build composites. There are situations where a composite glyph doesn't have any anchors, but the Glyphs app borrows mark positioning from the baseGlyph(s) and duplicates it in the composite during compile. The user can't even see these virtual mark positionings in the glyph view. The generated mark feature for Latin could become as huge as, or even bigger than, the one for Arabic. I wonder if this huge data could cause other issues? I would appreciate @behdad's and/or @anthrotype's insight into these questions.

May 07 '19 09:05 typoman

In my experience, mark positioning is not equal to accented composites.

May 07 '19 10:05 typoman

Since I don't know much about Devanagari, I would also appreciate @tiroj's insights. That would help to address any current issues with anchor definition or mark positioning in UFO.

May 07 '19 10:05 typoman
