opentype.js
Support multi-character emoji
This issue with monochrome Noto Emoji is distinct from the color emoji issue (#193).
#338 added support for non-Basic-Multilingual-Plane (BMP) characters, but uses Array.from, which doesn't account for combined emoji.
It seems that Opentype.js has the glyph information needed, but the initial text-to-glyph translation is the issue:
https://opentype.js.org/glyph-inspector.html
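To illustrate why Array.from falls short here, the following minimal sketch splits the family emoji into code points. The string is written with escapes so it survives any encoding; the variable names are only for illustration.

```javascript
// The family emoji: four people joined by three zero-width joiners (U+200D).
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// Array.from iterates by code point, so surrogate pairs stay intact...
const codePoints = Array.from(family);
console.log(codePoints.length); // 7

// ...but each joiner comes out as its own element, so the sequence is split.
const hex = codePoints.map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase());
console.log(hex.join(" "));
// U+1F468 U+200D U+1F469 U+200D U+1F467 U+200D U+1F466
```

Each of those seven elements is then mapped to its own glyph, which matches the behavior described in this issue.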
Expected Behavior
Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") should render the single combined family glyph.
Current Behavior
Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") renders the individual person glyphs instead of the combined family glyph.
Possible Solution
- If "ccmp" is not supported yet and would cover this, this issue can be closed as a duplicate of https://github.com/opentypejs/opentype.js/issues/443.
- Intl.Segmenter is a native solution, but isn't supported by Firefox yet.
const splitSegmentArray = (string) => Array.from(new Intl.Segmenter().segment(string)).map(x => x.segment);
console.log(splitSegmentArray("👨‍👩‍👧‍👦👩‍💻👩‍🌾🧡👨🏽‍🌾"))
- graphemer is a library-based solution. (It is a fairly big library.)
- twemoji-parser is focused on parsing emoji sequences, so it's smaller than graphemer.
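Since Intl.Segmenter may be missing in some environments, one option is to feature-detect it and fall back to code-point splitting. This is only a sketch: splitGraphemes is a hypothetical helper name, and the fallback still splits ZWJ sequences apart, so it degrades to the current behavior rather than fixing it.

```javascript
// Hypothetical helper: split text into user-perceived characters where possible.
function splitGraphemes(text) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text), (s) => s.segment);
  }
  // Fallback: code points only, so ZWJ sequences are still split apart.
  return Array.from(text);
}

// "a", the family emoji (written with escapes), and "b".
const parts = splitGraphemes("a\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}b");
console.log(parts.length); // 3 with Intl.Segmenter, 9 with the fallback
```

A library such as graphemer could take the place of the fallback branch if consistent behavior across browsers matters more than bundle size.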
Steps to Reproduce (for bugs)
Live demo: https://gm69qn.csb.app
- Call notoEmojiFont.stringToGlyphs("👨‍👩‍👧‍👦") and get glyphs for "👨👩👧👦" interspersed with the combiner ("uni200D") instead of the one glyph for the combined family.
- Same for other combined emoji, like 👩‍💻, 👩‍🌾, 👨🏽‍🌾, and emoji carrying a skin-tone modifier (🏾).
Context
We're adding support for emoji to Cuttle CAD, which can render various fonts as vectors for laser cutting, etc.
Your Environment
- Version used: 1.3.4
- Font used: Noto Emoji (ttf)
- Browser Name and version: Various tested
- Operating System and version (desktop or mobile): Mac OS desktop
- Link to your project: https://gm69qn.csb.app
It seems like font.tables.gsub has the ligatureSets info needed to combine these. Is that something that I can enable with an option?
notoEmojiFont.substitution.getFeature("ccmp") // Array(3640)
The feature tag is "ccmp"... I'm not seeing that called with defaults via getFeature or getMultiple, though there are some tests.
If "ccmp" is not supported yet, this can be closed as a duplicate of #443.
Looking at #443 I thought this was worth a try:
notoEmojiFont.substitution.add(
"ccmp",
notoEmojiFont.substitution.getFeature('ccmp')
);
but got:
Error: Ligature: unable to modify coverage table format 2
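In addition to the ccmp substitutions, variation selectors (https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)) need to be taken into account. For example, a symbol on its own vs. the same symbol followed by U+FE0F (VS16) are different strings, even though they may look the same.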
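As a rough illustration of the variation-selector point, the sketch below compares a base character with and without VS16 and strips selectors before any lookup. stripVariationSelectors is a hypothetical helper, and U+2620 is chosen only as an example base character. Dropping the selectors discards the text-versus-emoji presentation hint, which may or may not be acceptable for a given font.

```javascript
// Variation selectors VS1 to VS16 occupy U+FE00 to U+FE0F.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F]/g;

// Hypothetical helper: drop the selectors so lookups see only the base characters.
function stripVariationSelectors(text) {
  return text.replace(VARIATION_SELECTORS, "");
}

// U+2620 is used here only as an illustrative base character.
const plain = "\u2620";
const withVs16 = "\u2620\uFE0F";

console.log(plain === withVs16); // false
console.log(Array.from(withVs16).length); // 2
console.log(stripVariationSelectors(withVs16) === plain); // true
```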
I'm also looking for a workaround for this. It would be nice to either support it or have a documented workaround.
Here's my workaround.
// Opentype.js doesn't actually support these substitutions, so we'll have to
// search them manually
const substitutions = font.substitution.getFeature("ccmp");
function emojiToGlyph(emojiString) {
  const glyphs = font
    .stringToGlyphs(emojiString)
    // Discarding these makes the substitution search work for emoji sequences
    // with variation selectors
    // https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)
    .filter((glyph) => glyph.index <= 1850);
  let glyph;
  if (glyphs.length === 1) {
    glyph = glyphs[0];
  } else if (glyphs.length > 1) {
    const indexes = glyphs.map((glyph) => glyph.index);
    const sub = substitutions.find((substitution) => equals(substitution.sub, indexes));
    if (sub) {
      glyph = font.glyphs.get(sub.by);
    }
  }
  if (glyph) {
    return glyph;
  } else {
    throw new Error(`${emojiString} - couldn't find a glyph :(`);
  }
}
emojiToGlyph("👨‍👩‍👧‍👦");
/** Custom equals function that can also check lists. */
function equals(a, b) {
  if (a === b) {
    return true;
  } else if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) {
      return false;
    }
    for (let i = 0; i < a.length; i += 1) {
      if (!equals(a[i], b[i])) {
        return false;
      }
    }
    return true;
  } else {
    return false;
  }
}
Caveats:
This only works for one emoji. To replace the glyphs in an arbitrary string, we would also need tokenizer logic.
Only tested with Noto Emoji.
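For the first caveat, one possible direction is to skip tokenizing altogether and scan the glyph index list for matching sequences. The sketch below is an assumption-laden illustration, not how opentype.js works internally: applyLigatures is a hypothetical helper, the { sub, by } entry shape is taken from the workaround above, the glyph indexes are made up, and longest-match-first is a simplification of real GSUB lookup ordering.

```javascript
// Hypothetical helper: greedily replace runs of glyph indexes with ligature glyphs.
// `substitutions` is assumed to be a list of { sub: number[], by: number } entries.
function applyLigatures(indexes, substitutions) {
  // Try longer sequences first so the most specific ligature wins.
  const ordered = [...substitutions].sort((a, b) => b.sub.length - a.sub.length);
  const result = [];
  let i = 0;
  while (i < indexes.length) {
    const match = ordered.find(({ sub }) =>
      sub.length > 0 &&
      i + sub.length <= indexes.length &&
      sub.every((glyphIndex, offset) => indexes[i + offset] === glyphIndex)
    );
    if (match) {
      result.push(match.by); // one ligature glyph replaces the whole run
      i += match.sub.length;
    } else {
      result.push(indexes[i]); // no ligature starts here; keep the glyph as-is
      i += 1;
    }
  }
  return result;
}

// Made-up glyph indexes, for illustration only (3 stands in for the ZWJ glyph).
const mockSubstitutions = [
  { sub: [10, 3, 11], by: 500 },
  { sub: [10, 3, 11, 3, 12], by: 501 },
];
const ligated = applyLigatures([7, 10, 3, 11, 3, 12, 10, 3, 11, 9], mockSubstitutions);
console.log(ligated); // [ 7, 501, 500, 9 ]
```

With a real font, the input would be the indexes from font.stringToGlyphs(text) and the output would be mapped back through font.glyphs.get, as in the workaround.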
Here are the different options: https://medium.com/making-faces-and-other-emoji/emoji-fonts-technically-40f3fdc0869e. I'd recommend at least supporting COLR/CPAL, as it's probably the most widely supported format and one of the most commonly implemented in fonts. It would probably also be a good idea to implement CBDT/CBLC support.
ccmp looks like a mandatory feature. It isn't displayed in the feature list, but it always runs before the text is decoded.
https://learn.microsoft.com/en-us/typography/script-development/standard
Maybe we can add a preprocessing step in Font.stringToGlyphs()?