Ligation without ZWJ and VS16

Open Crissov opened this issue 8 years ago • 5 comments

With standardized emoji sequences, the author is responsible for the correct order of emoji characters, including any mandatory variation selectors and zero-width joiners. For most cases and emoji input GUIs, this just works. If these sequences are not handled at the system level, OpenType fonts probably employ the rlig ‘required ligatures’ feature, which is enabled (or enforced) by default.

It may sometimes be useful for a reader or designer to control (additional) ligation. This would be handled by the liga ‘standard ligatures’ or clig ‘contextual ligatures’ OTF features, which can be enabled on demand. Possible use cases: ligature aliases, e.g. Woman+Man+Child = Man+Woman+Child = Child+Man+Woman = Woman+Child+Man …, and non-standard ligatures, e.g. Police Car 🚓 + Woman 👩 = Police Woman = Police Officer 👮 + Female Sign ♀️. #5 should be fixed first, of course.
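To illustrate the alias use case, here is a minimal sketch that expands one ligature into a substitution rule for every ordering of its components. The `alias_rules` helper and the glyph names `woman`, `man`, `child` and `family` are hypothetical; real glyph names depend on the font.

```python
from itertools import permutations


def alias_rules(components, ligature):
    """Return one .fea 'sub' rule per ordering of the component glyphs."""
    return [
        f"sub {' '.join(order)} by {ligature};"
        for order in permutations(components)
    ]


# Hypothetical glyph names, for illustration only.
for rule in alias_rules(["woman", "man", "child"], "family"):
    print(rule)
```

Three components yield six rules, so longer sequences grow quickly. That is one reason to keep such aliases in an optional feature rather than in rlig.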

Crissov avatar Oct 17 '16 13:10 Crissov

Interesting! How do you imagine it working? Options in the YAML?

13rac1 avatar Oct 17 '16 18:10 13rac1

.fea files maybe, which can be reused if glyph names are the same.

Crissov avatar Oct 18 '16 05:10 Crissov

Hmm... .fea files are the sort of thing I'd rather leave to other tools such as afdko. My goal is a tool a graphic artist can understand without learning all about fonts. I know that's blasphemy as far as many people are concerned... Haha!

There's gotta be a way to support more advanced features without re-implementing afdko?

13rac1 avatar Oct 18 '16 06:10 13rac1

Since this would need only a very limited subset of Adobe’s .fea syntax (which FontForge also supports), it could probably be done in the YAML as well, so the tool could build a (virtual) features file which it can then feed to the libs.

sub left-glyph right-glyph by ligature-glyph;
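As a rough sketch of that idea, assume the YAML has already been loaded into a plain Python list of dictionaries. The `components` and `ligature` keys, the `build_feature_text` helper and the glyph names are all hypothetical and not part of scfbuild. The virtual features file could then be assembled as a string:

```python
def build_feature_text(ligatures, feature_tag="liga"):
    """Return Adobe .fea source with one ligature substitution per entry."""
    lines = [f"feature {feature_tag} {{"]
    for entry in ligatures:
        components = " ".join(entry["components"])
        lines.append(f"    sub {components} by {entry['ligature']};")
    lines.append(f"}} {feature_tag};")
    return "\n".join(lines)


# Hypothetical data, as it might look after loading the YAML config.
example = [
    {"components": ["u1F693", "u1F469"], "ligature": "police_woman"},
]
print(build_feature_text(example))
```

The resulting text could be handed to whichever feature compiler the build already uses, so the graphic artist never has to touch a .fea file directly.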

OTF substitution is not based on code-points but on glyph names. I haven’t checked yet whether fonts generated by scfbuild and system emoji fonts use (perhaps even the same) systematic glyph names based upon code-points (as suggested by Adobe: uni…). The source image files are usually named by code-point(s) (at least in Emojione, Twemoji and – differently – Noto Emoji, but not Emojidex) and the directory name determines their kind (monochrome or colorful, SVG or PNG, size). That means that scfbuild is able to hide the distinction between files, glyphs and code-points from its users (graphic artists or others) in many cases, but it needs the mapping information somewhere. Keeping it in the file name probably led to #5, and the solution to that issue should be very similar to the solution to this one, #11. If possible within YAML, I suggest a simple syntax employing -> or =:

alias -> original
alias -> original -> #code-point
left right -> ligature
custom-file-name -> glyph-name
custom-file-name -> #code-point
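A minimal sketch of how such lines might be parsed, assuming one rule per line, `->` as the only separator, whitespace between ligature components and `#` followed by hexadecimal digits for a code-point. The `parse_mapping_line` function and the tuple it returns are hypothetical.

```python
def parse_mapping_line(line):
    """Split 'a b -> c -> #1F46E' into (sources, target, codepoint).

    sources:   glyph or file names left of the first arrow
    target:    glyph name found after an arrow, or None
    codepoint: integer parsed from a '#hex' field, or None
    """
    fields = [field.strip() for field in line.split("->")]
    if len(fields) < 2 or not all(fields):
        raise ValueError(f"not a mapping rule: {line!r}")
    sources = fields[0].split()
    target = None
    codepoint = None
    for field in fields[1:]:
        if field.startswith("#"):
            codepoint = int(field[1:], 16)
        else:
            target = field
    return sources, target, codepoint


print(parse_mapping_line("left right -> ligature"))
print(parse_mapping_line("alias -> original -> #1F46E"))
print(parse_mapping_line("custom-file-name -> #1F693"))
```

Returning the three parts separately leaves it to the caller to decide whether a rule becomes a cmap entry, a glyph alias or a ligature substitution.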

Crissov avatar Oct 19 '16 15:10 Crissov

The place to fix this seems to be codepoint_from_filepath() in util.py. It currently only supports file names that are either a single hexadecimal code point or a string of multiple code points concatenated by hyphens. This format is used by Emojione and Twemoji.

match = re.match(r"[\da-f]{4,5}(?:-[\da-f]{4,5})*", filename)  # Emojione, Twemoji
codepoints = match.group(0).split("-") if match else []

Google Noto uses a more verbose naming convention, including a prefixed emoji_u and an underscore as glue character.

match = re.match(r"emoji((?:_u?[\da-f]{4,5})+)", filename)  # Noto
codepoints = re.findall(r"[\da-f]{4,5}", match.group(1)) if match else []

These code points can then be converted to de-facto standard Adobe glyph names and a cmap table can be built accordingly.
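The conversion to glyph names could look roughly like this. It assumes the uniXXXX form for code points in the Basic Multilingual Plane, the uXXXXX form above it, and underscores between the components of a sequence, following Adobe's glyph naming conventions. `glyph_name` is a hypothetical helper, not an existing scfbuild function.

```python
def glyph_name(codepoints):
    """Build an Adobe-style glyph name from a list of integer code points."""
    parts = []
    for cp in codepoints:
        if cp <= 0xFFFF:
            parts.append(f"uni{cp:04X}")  # BMP: exactly four hex digits
        else:
            parts.append(f"u{cp:X}")  # supplementary planes: five or six
    return "_".join(parts)


# Hex strings as extracted from a file name such as '1f468-200d-1f469'.
hex_strings = "1f468-200d-1f469".split("-")
print(glyph_name([int(h, 16) for h in hex_strings]))
```

Only single code points would go into the cmap table. Multi-code-point names like the one above would be reached through ligature substitution instead.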

Emojidex, however, uses descriptive file names which can be used as glyph names, but these require some kind of lookup or heuristics to match them correctly with Unicode code points. The matches could be hard-coded, e.g. based upon short names and character names or annotations, but a user-defined map as suggested above with custom-file-name -> #code-point is probably the better approach. Emojidex also contains animated glyphs whose frames reside in a sub-folder, following the same naming scheme as .svg files, together with an animation.json. It also features non-standard emojis that would require a custom mapping to PUA codes or ligatures anyway.

glyphname, variant = re.match(r"([A-Za-z0-9_]+)(\([a-z]+\))?", filename).groups()  # Emojidex

Pseudo-ligatures with variation selectors VS-15/TVS U+FE0E and VS-16/EVS U+FE0F could be added automatically based upon current conventions documented in UTR#51 and custom extensions.
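One possible shape for this, as a sketch only: the caller supplies which glyphs take part in variation sequences (for instance from the Unicode emoji data) and, optionally, a monochrome alternate for each. The `variation_rules` helper and the `uni2640.text` glyph name are hypothetical.

```python
VS15 = "uniFE0E"  # text presentation selector
VS16 = "uniFE0F"  # emoji presentation selector


def variation_rules(emoji_glyphs, text_glyphs=None):
    """Return .fea 'sub' rules that absorb VS-16 and, optionally, VS-15.

    emoji_glyphs: glyph names that have a colour glyph in the font
    text_glyphs:  optional dict mapping a base glyph name to the name
                  of a monochrome alternate glyph
    """
    text_glyphs = text_glyphs or {}
    rules = []
    for name in emoji_glyphs:
        # Base glyph followed by VS-16 collapses to the colour glyph.
        rules.append(f"sub {name} {VS16} by {name};")
        if name in text_glyphs:
            # Base glyph followed by VS-15 selects the text-style glyph.
            rules.append(f"sub {name} {VS15} by {text_glyphs[name]};")
    return rules


for rule in variation_rules(["uni2640"], {"uni2640": "uni2640.text"}):
    print(rule)
```

Nothing here decides which characters default to emoji or text presentation. That would still have to come from UTR#51 or from custom extensions.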

I don’t know anything about Python, so treat the above as pseudo-code.

Crissov avatar Dec 04 '16 09:12 Crissov