opentype.js

Support multi-character emoji

Open forresto opened this issue 2 years ago • 6 comments

This issue with monochrome Noto Emoji is distinct from the color emoji issue (#193).

#338 added support for non-Basic-Multilingual-Plane (BMP) characters, but uses Array.from, which doesn't account for combined emoji.
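A minimal sketch of why Array.from falls short here (this is an illustration, not code from the opentype.js source). The family emoji is written with escapes so that the zero-width joiners (U+200D) are visible:

```javascript
// Family emoji: man, ZWJ, woman, ZWJ, girl, ZWJ, boy.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// Array.from iterates by code point, so surrogate pairs stay intact,
// but the ZWJ sequence is still broken into its seven parts.
const codePoints = Array.from(family);

console.log(family.length);     // UTF-16 code units: 11
console.log(codePoints.length); // code points: 7
console.log(
  codePoints.map((c) => "U+" + c.codePointAt(0).toString(16).toUpperCase())
);
```

Each of the seven code points is then mapped to its own glyph, which matches the behavior described under "Current Behavior".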

It seems that Opentype.js has the glyph information needed, but the initial text-to-glyph translation is the issue:

image https://opentype.js.org/glyph-inspector.html

Expected Behavior

Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") should render image

Current Behavior

Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") renders image

Possible Solution

  1. If "ccmp" is not supported yet and would cover this, this issue can be closed as a duplicate of https://github.com/opentypejs/opentype.js/issues/443.

  2. Intl.Segmenter is a native solution, but isn't supported by Firefox yet.

const splitSegmentArray = (string) => Array.from(new Intl.Segmenter().segment(string)).map(x => x.segment);
console.log(splitSegmentArray("😅👨‍👩‍👧‍👦💖👩‍💻💔👩‍🌾🧡👨🏽‍🌾💜🖖🏾🌈"))
  3. graphemer is a library-based solution. (It is a fairly big library.)

  4. twemoji-parser is focused on parsing emoji sequences, so it's smaller than graphemer.
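Since Intl.Segmenter is not available everywhere, one option is to feature-detect it. The sketch below is only an illustration: splitGraphemes is a hypothetical helper name, and the fallback simply reproduces the per-code-point behavior instead of pulling in graphemer or twemoji-parser.

```javascript
// Split a string into user-perceived characters (grapheme clusters).
// Falls back to code points when Intl.Segmenter is unavailable, which
// keeps the current per-code-point behavior rather than failing.
function splitGraphemes(text) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text), (part) => part.segment);
  }
  return Array.from(text);
}

// Grinning face with sweat, the family ZWJ sequence, sparkling heart.
const sample =
  "\u{1F605}\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}\u{1F496}";
console.log(splitGraphemes(sample).length);
```

Where Intl.Segmenter exists, the family sequence comes back as a single segment; with the fallback it is still split into seven code points.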

Steps to Reproduce (for bugs)

Live demo: https://gm69qn.csb.app

  1. Call notoEmojiFont.stringToGlyphs("👨‍👩‍👧‍👦") and get glyphs for "👨👩👧👦" interspersed with the combiner ("uni200D") instead of the one glyph for the combined family.

image

  2. Same for other combined emoji, like 👩‍💻, 👩‍🌾, 👨🏽‍🌾, 🖖🏾.

Context

We're adding support for emoji to Cuttle CAD, which can render various fonts as vectors for laser cutting, etc.

Your Environment

  • Version used: 1.3.4
  • Font used: Noto Emoji (ttf)
  • Browser Name and version: Various tested
  • Operating System and version (desktop or mobile): Mac OS desktop
  • Link to your project: https://gm69qn.csb.app

forresto avatar May 02 '22 11:05 forresto

It seems like font.tables.gsub has the ligatureSets info needed to combine these. Is that something that I can enable with an option? image

notoEmojiFont.substitution.getFeature("ccmp") // Array(3640)

The feature tag is "ccmp" ... I'm not seeing that called with defaults via getFeature or getMultiple, though there are some tests. 🤔

If "ccmp" is not supported yet, this can be closed as a duplicate of #443.

forresto avatar May 03 '22 10:05 forresto

Looking at #443 I thought this was worth a try:

notoEmojiFont.substitution.add(
  "ccmp", 
  notoEmojiFont.substitution.getFeature('ccmp')
);

but got:

Error: Ligature: unable to modify coverage table format 2

forresto avatar May 09 '22 09:05 forresto

In addition to the ccmp substitutions, https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block) need to be taken into account. For example, "☠" vs "☠️".
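As a rough illustration of the difference, the two skull strings differ only by a trailing U+FE0F (VS16). The sketch below strips the selectors U+FE00..U+FE0F before any glyph lookup; stripVariationSelectors is a hypothetical helper, and dropping the selector is an assumption that only makes sense for a monochrome font where the text and emoji presentations share one glyph.

```javascript
// U+FE00..U+FE0F are the variation selectors VS1..VS16.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F]/g;

// Remove variation selectors so the remaining code points can be matched
// against glyphs or ligature sequences directly.
function stripVariationSelectors(text) {
  return text.replace(VARIATION_SELECTORS, "");
}

const textSkull = "\u2620";        // skull and crossbones, no selector
const emojiSkull = "\u2620\uFE0F"; // same character followed by VS16

console.log(Array.from(emojiSkull).length);                     // 2
console.log(stripVariationSelectors(emojiSkull) === textSkull); // true
```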

forresto avatar May 11 '22 14:05 forresto

I'm also looking for a workaround for this. It would be nice to have this supported, or at least to have a workaround.

jamesjoung avatar May 16 '22 05:05 jamesjoung

Here's my workaround.

// Opentype.js doesn't actually apply these substitutions, so we'll have to
// search for them manually
const substitutions = font.substitution.getFeature("ccmp");

function emojiToGlyph(emojiString) {
  const glyphs = font
    .stringToGlyphs(emojiString)
    // Discarding these makes the substitution search work for emoji sequences
    // with variation selectors
    // https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)
    // (glyph indexes are font-specific; 1850 was chosen for Noto Emoji)
    .filter((glyph) => glyph.index <= 1850);

  // A single remaining glyph needs no substitution.
  if (glyphs.length === 1) {
    return glyphs[0];
  }

  if (glyphs.length > 1) {
    // Look for a ligature whose input sequence matches these glyph indexes.
    const indexes = glyphs.map((glyph) => glyph.index);
    const sub = substitutions.find((substitution) => equals(substitution.sub, indexes));
    if (sub) {
      return font.glyphs.get(sub.by);
    }
  }

  throw new Error(`${emojiString} - couldn't find a glyph :(`);
}

emojiToGlyph("👨‍👩‍👧‍👦");
/** Custom equals function that can also check lists. */
function equals(a, b) {
  if (a === b) {
    return true;
  } else if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) {
      return false;
    }
    for (let i = 0; i < a.length; i += 1) {
      if (!equals(a[i], b[i])) {
        return false;
      }
    }
    return true;
  } else {
    return false;
  }
}

Caveats:

This only works for one emoji. To replace the glyphs in an arbitrary string, we would also need tokenizer logic.

Only tested with Noto Emoji.
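One possible shape for that tokenizer logic, sketched under the assumption that Intl.Segmenter is available: split the string into grapheme clusters and hand each one to a per-emoji lookup. graphemesToGlyphs and its lookupGlyph callback are hypothetical names; in practice the callback could be the emojiToGlyph workaround above.

```javascript
// Map every grapheme cluster of `text` through `lookupGlyph`.
// `lookupGlyph` is a hypothetical callback, for example emojiToGlyph.
function graphemesToGlyphs(text, lookupGlyph) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const glyphs = [];
  for (const { segment } of segmenter.segment(text)) {
    glyphs.push(lookupGlyph(segment));
  }
  return glyphs;
}

// Usage with a stand-in lookup that reports the code point count per cluster:
// woman technologist (woman, ZWJ, laptop) followed by a rainbow.
const counts = graphemesToGlyphs(
  "\u{1F469}\u200D\u{1F4BB}\u{1F308}",
  (cluster) => Array.from(cluster).length
);
console.log(counts); // [3, 1]
```

Passing the lookup as a parameter keeps the segmentation step independent of any particular font, so the same function would work with a different substitution search.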

image

forresto avatar May 16 '22 18:05 forresto

Here are the different options: https://medium.com/making-faces-and-other-emoji/emoji-fonts-technically-40f3fdc0869e I'd recommend at least supporting COLR/CPAL, as it's probably the most widely supported format and one of the most commonly implemented in fonts. It would also probably be a good idea to implement CBDT/CBLC support.

ILOVEPIE avatar Nov 20 '22 01:11 ILOVEPIE

ccmp looks like a mandatory feature: it isn't displayed in the feature list, but it always runs before the text is processed. image https://learn.microsoft.com/en-us/typography/script-development/standard Maybe we could add a preprocessing step in Font.stringToGlyphs()?
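A very rough sketch of what such a preprocessing step might do, working only on arrays of glyph indexes. This is not opentype.js API: applyLigatures is a hypothetical helper, the { sub, by } entry shape is assumed from the workaround earlier in this thread, and real GSUB processing applies lookups in lookup order using coverage tables rather than a simple longest-match scan.

```javascript
// Greedy longest-match ligature substitution over an array of glyph indexes.
// `ligatures` entries are assumed to look like { sub: [i1, i2, ...], by: j }.
function applyLigatures(indexes, ligatures) {
  // Try longer sequences first so a full sequence wins over a shorter prefix.
  const ordered = [...ligatures].sort((a, b) => b.sub.length - a.sub.length);
  const out = [];
  let i = 0;
  while (i < indexes.length) {
    const match = ordered.find(
      (lig) => lig.sub.length > 0 && lig.sub.every((g, k) => indexes[i + k] === g)
    );
    if (match) {
      out.push(match.by); // replace the whole sequence with one glyph
      i += match.sub.length;
    } else {
      out.push(indexes[i]); // no ligature starts here, keep the glyph
      i += 1;
    }
  }
  return out;
}

// Made-up glyph indexes, for illustration only (99 stands in for the ZWJ glyph).
const ligatures = [
  { sub: [10, 99, 11], by: 500 },
  { sub: [10, 99, 11, 99, 12], by: 501 },
];
console.log(applyLigatures([7, 10, 99, 11, 99, 12, 10, 99, 11, 8], ligatures));
// [7, 501, 500, 8]
```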

TonyJR avatar Mar 18 '24 09:03 TonyJR