harfbuzz icon indicating copy to clipboard operation
harfbuzz copied to clipboard

Lookup flags in fonts with no GDEF table

Open khaledhosny opened this issue 5 years ago • 35 comments

Someone reports an issue in LibreOffice about some Iranian fonts like B Nazanin (fonts attached in the LIbreOffice issue).

Basically the font has mark ligatures for shadda and the lookups has ignore base and ignore ligatures set (which makes no sense), which causes HarfBuzz to correctly skip bases and make ligature between shadda and fathatan in strings like:

$ hb-view "B Nazanin.TTF"  اتّفاقاً

HarfBuzz Oddly enough, both Uniscribe and Core Text ignore the lookup flags and don’t for the ligatures.

$ hb-view "B Nazanin.TTF"  اتّفاقاً --shaper=coretext

Core Text

As it turns out, the font has no GDEF table, so HarfBuzz synthesizes one and use it here.

It is a rather odd font, and I HarfBuzz synthesis of a GDEF table actually matches what the spec is saying (emphasis mine):

In addition, a client uses class definitions to apply GSUB and GPOS LookupFlag data correctly. For example, a LookupFlag may specify ignoring ligatures and marks during a glyph operation. If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

But it might be an interoperability issue that we might want to consider (my 2¢ is to 🤷🏽 and declare it working as designed).

khaledhosny avatar Aug 19 '20 00:08 khaledhosny

Sounds related to the GDEF change we made and reverted. The carrying-forward of the flags is wrong... Anyway. I'll check later. This was just me speculating based on the title.

behdad avatar Aug 19 '20 00:08 behdad

That is with 2.7.1 where the offending commit is reverted, unless you mean the old behavior was also wrong.

khaledhosny avatar Aug 19 '20 00:08 khaledhosny

Hmm, if I add a GDEF table with glyph classes, HarfBuzz stops making the ligatures: BNazanin#2.zip

khaledhosny avatar Aug 19 '20 00:08 khaledhosny

(Core Text and Uniscribe don’t make the ligatures when GDEF table is present as well, so my initial guess was wrong)

khaledhosny avatar Aug 19 '20 00:08 khaledhosny

unless you mean the old behavior was also wrong.

Yes.

behdad avatar Aug 19 '20 00:08 behdad

Simon found problems with pre-2.7.0 HB-no-GDEF. We tried to address, broke others.

Even then there's more issues. I'll try to write them down here soon.

behdad avatar Aug 19 '20 00:08 behdad

Let's revive this issue and figure out what needs to be fixed.

behdad avatar Jun 14 '21 21:06 behdad

Fixed by https://github.com/harfbuzz/harfbuzz/pull/3365 ?

behdad avatar Jan 19 '22 16:01 behdad

Does not seem so.

khaledhosny avatar Jan 19 '22 21:01 khaledhosny

Can you send me the font in question please?

behdad avatar Jan 19 '22 21:01 behdad

Can you send me the font in question please?

Nevermind. Got it.

behdad avatar Jan 19 '22 21:01 behdad

I think I know what to do.

behdad avatar Jan 19 '22 21:01 behdad

Basically, what they are doing, I suppose, is that when matching ligatures (and context, etc) other than the first item, ignore IgnoreBase and IgnoreLigature... lol. We do the same during mark-attachment lookups.

behdad avatar Jan 19 '22 21:01 behdad

If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

Do we actually have evidence that Uniscribe does this?

behdad avatar Jun 25 '22 19:06 behdad

I'm leaning towards closing this.

behdad avatar Jun 25 '22 19:06 behdad

I have no more information to add and since apparently it is only this font and IMHO that is a badly built one, lets close.

khaledhosny avatar Jun 26 '22 07:06 khaledhosny

This problem is not limited to B Nazanin. I have the same problem with Arial and Tahoma:

$ hb-view /usr/share/fonts/truetype/msttcorefonts/Arial.ttf  اتّفاقاً
## PROBLEMATIC OUTPUT HERE ##

$ hb-view --version
hb-view (HarfBuzz) 2.7.4
Available shapers: graphite2,ot,fallback

I am using Ubuntu 22.04.

The same problem is visible with Arial and the latest version of harfbuzz:

$ ./util/hb-view /usr/share/fonts/truetype/msttcorefonts/Times_New_Roman.ttf اتّفاقاً
## PROBLEMATIC OUTPUT HERE ##

$ ./util/hb-view --version
hb-view (HarfBuzz) 4.4.1
Available shapers: ot,fallback

hosseinn avatar Jul 17 '22 13:07 hosseinn

This is not the same issue. The old corefonts are broken, I get the same failure in with DirectWrite as well. Newer versions of the fonts are fine.

khaledhosny avatar Jul 17 '22 14:07 khaledhosny

To be clear, the fonts have ligature lookups for the mark ligatures that set both ignore base and ignore ligatures lookup flags, so HarfBuzz is doing what the fonts asked for.

khaledhosny avatar Jul 17 '22 14:07 khaledhosny

So the difference is that Arial has a GDEF table, but B Nazanin does not. So again I think the issue is that HarfBuzz is synthesizing glyph classes when GDEF table is absent and DirectWrite is apparently not doing that.

khaledhosny avatar Jul 17 '22 15:07 khaledhosny

Re-opening to see if we actually want to follow MS implementation here or not.

khaledhosny avatar Jul 17 '22 15:07 khaledhosny

If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

Do we actually have evidence that Uniscribe does this?

Apparently the answer is no, it does not do this.

khaledhosny avatar Jul 17 '22 15:07 khaledhosny

With:

diff --git a/src/hb-ot-shape.cc b/src/hb-ot-shape.cc
index 0806abb7d..29c2ec2f9 100644
--- a/src/hb-ot-shape.cc
+++ b/src/hb-ot-shape.cc
@@ -137,8 +137,8 @@ hb_ot_shape_planner_t::compile (hb_ot_shape_plan_t           &plan,
    * Decide who provides glyph classes. GDEF or Unicode.
    */

-  if (!hb_ot_layout_has_glyph_classes (face))
-    plan.fallback_glyph_classes = true;
+  //if (!hb_ot_layout_has_glyph_classes (face))
+  //  plan.fallback_glyph_classes = true;

   /*
    * Decide who does substitutions. GSUB, morx, or fallback.

The B* fonts stop making the incorrect mark ligature.

khaledhosny avatar Jul 17 '22 15:07 khaledhosny

If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

Do we actually have evidence that Uniscribe does this?

Apparently the answer is no, it does not do this.

I opened https://github.com/MicrosoftDocs/typography-issues/issues/954 for spec clarification.

khaledhosny avatar Jul 17 '22 15:07 khaledhosny

If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

Do we actually have evidence that Uniscribe does this?

Apparently the answer is no, it does not do this.

From the Uniscribe API limitations also I have deduced that they can't be doing this.

behdad avatar Jul 18 '22 02:07 behdad

If the font does not include a GlyphClassDef table, the client must define and maintain this information when using the GSUB and GPOS tables.

Do we actually have evidence that Uniscribe does this?

Apparently the answer is no, it does not do this.

From the Uniscribe API limitations also I have deduced that they can't be doing this.

At least not for positioning.

behdad avatar Jul 18 '22 02:07 behdad

I'm thinking about blacklisting the GPOS of these fonts.

Where can we find them all?

behdad avatar Jul 30 '22 17:07 behdad

This problem is not limited to B Nazanin. I have the same problem with Arial and Tahoma:

FWIW we intentionally nuke the GDEF of these fonts, up to Windows 8. Wonder if we should do the same to their GPOS.

behdad avatar Jul 30 '22 17:07 behdad

This is one way to handle this: Only synthesize glyph classes if font doesn't have GDEF and doesn't have GPOS.

diff --git a/src/hb-ot-shape.cc b/src/hb-ot-shape.cc
index 3da317a5c..5c68bee21 100644
--- a/src/hb-ot-shape.cc
+++ b/src/hb-ot-shape.cc
@@ -137,7 +137,8 @@ hb_ot_shape_planner_t::compile (hb_ot_shape_plan_t           &plan,
    * Decide who provides glyph classes. GDEF or Unicode.
    */
 
-  if (!hb_ot_layout_has_glyph_classes (face))
+  if (!hb_ot_layout_has_glyph_classes (face) &&
+      !hb_ot_layout_has_positioning (face))
     plan.fallback_glyph_classes = true;
 
   /*

One test fails though. Need to investigate.

I haven't checked that it actually fixes the case at hand.

behdad avatar Jul 30 '22 17:07 behdad

One test fails though. Need to investigate.

One of the failing tests is in Thai and we fail to zero mark advances, I suppose because there is no mark attachment in the GPOS. This change seems fragile at this point in time :(.

behdad avatar Jul 30 '22 17:07 behdad