Arabic Hamza below with Kasra breaks fonts.

Open ntounsi opened this issue 8 years ago • 15 comments

Some fonts don't display both the Hamza Below (a letter) and the Kasra below it. (a) hamzacgjkasra

The Kasra is put above the Hamza. Bad. (b) hamzakasra

Even if you permute Hamza and Kasra.

Example :
<p> &#x0642;&#x0627;&#x0649;&#x0655;&#x0650;&#x062F;<p> قاىِٕد

A workaround is to put the CGJ (U+034F COMBINING GRAPHEME JOINER) between Hamza and Kasra: <p> &#x0642;&#x0627;&#x0649;&#x0655;&#x034F;&#x0650;&#x062F;<p> قاىٕ͏ِد
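As a minimal sketch of the two sequences above, the following builds both strings from their code points and prints them as HTML hexadecimal character references. The names `to_ncr`, `plain` and `with_cgj` are illustrative only and do not come from the issue.

```python
# Code points used in the example word.
HAMZA_BELOW = "\u0655"  # ARABIC HAMZA BELOW
KASRA = "\u0650"        # ARABIC KASRA
CGJ = "\u034F"          # COMBINING GRAPHEME JOINER


def to_ncr(text: str) -> str:
    """Encode every character as an HTML hexadecimal character reference."""
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


# QAF, ALEF, ALEF MAKSURA, then the marks, then DAL.
plain = "\u0642\u0627\u0649" + HAMZA_BELOW + KASRA + "\u062F"
with_cgj = "\u0642\u0627\u0649" + HAMZA_BELOW + CGJ + KASRA + "\u062F"

print(to_ncr(plain))
print(to_ncr(with_cgj))
```

The only difference between the two outputs is the extra `&#x034F;` reference between the Hamza and the Kasra.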

This works with some basic fonts (e.g. Arial), but most fonts break it, giving (d) hamzacgjkasrabroken

Question: what is the right rendering of this example: (a), (b) or (d)? Or which font is broken: the one that puts the Kasra above the Hamza (b), or the one that breaks the joining as in (d)? Note that Arabic Hamza below is not widely used in Arabic. When Hamza is to be pronounced (’i), it is put ABOVE the letter Yeh ئ and encoded as U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE. Thus, the above word (qā’id) is actually written as in (e) hamzanormalabove

ntounsi avatar Oct 12 '17 11:10 ntounsi

(b) is definitely a bug, most likely a font bug, can you tell me which font is that?

(d) might be a browser bug, my guess is that the browser finds that the font does not have a glyph for CGJ and (incorrectly) uses a fallback font for it which breaks shaping. It might be interesting to know which browser and font combination fails here.

khaledhosny avatar Oct 13 '17 02:10 khaledhosny

screen shot 2017-10-13 at 11 30 02

Firefox on the left, Chrome on the right; font: Amiri; OS: Mac OS X 10.11.6

(first two lines have diacritics in opposite orders in memory; last line has CGJ)

r12a avatar Oct 13 '17 10:10 r12a

This link should replicate the above: http://r12a.github.io/pickers/arabic/?text=%D9%82%D8%A7%D9%89%D9%95%D9%90%D8%AF%0A%D9%82%D8%A7%D9%89%D9%90%D9%95%D8%AF%0A%D9%82%D8%A7%D9%89%D9%95%CD%8F%D9%90%D8%AF
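To see exactly which sequences the link carries, a small sketch like the following decodes the `text` query parameter and names each code point. It uses only the Python standard library; the helper name `describe` is made up for illustration.

```python
import unicodedata
from urllib.parse import parse_qs, urlsplit

URL = (
    "http://r12a.github.io/pickers/arabic/?text="
    "%D9%82%D8%A7%D9%89%D9%95%D9%90%D8%AF%0A"
    "%D9%82%D8%A7%D9%89%D9%90%D9%95%D8%AF%0A"
    "%D9%82%D8%A7%D9%89%D9%95%CD%8F%D9%90%D8%AF"
)


def describe(url: str) -> list[list[str]]:
    """Return, per line of the 'text' parameter, the code points with names."""
    # parse_qs percent-decodes the value as UTF-8; %0A becomes a newline.
    text = parse_qs(urlsplit(url).query)["text"][0]
    return [
        [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in line]
        for line in text.split("\n")
    ]


for number, line in enumerate(describe(URL), start=1):
    print(f"line {number}:")
    for entry in line:
        print("   ", entry)
```

The three lines differ only in the marks after ALEF MAKSURA: hamza then kasra, kasra then hamza, and hamza, CGJ, kasra.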

r12a avatar Oct 13 '17 10:10 r12a

Looks like another case of Arabic mark mis-reordering during normalization. HarfBuzz master implements UAOA and the 3 cases render the same with hb-view tools, so it should be fixed once Firefox and Chrome upgrade.
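The reordering mentioned here can be reproduced outside any browser. The sketch below uses Python's standard `unicodedata` module, not HarfBuzz, and only illustrates canonical reordering under Unicode normalization; the variable names are illustrative. Kasra has a lower canonical combining class than Hamza below, so normalization moves it first, while CGJ (class 0) blocks the swap.

```python
import unicodedata

BASE = "\u0642\u0627\u0649"  # QAF, ALEF, ALEF MAKSURA
DAL = "\u062F"

hamza_kasra = BASE + "\u0655\u0650" + DAL            # hamza below, kasra
kasra_hamza = BASE + "\u0650\u0655" + DAL            # kasra, hamza below
hamza_cgj_kasra = BASE + "\u0655\u034F\u0650" + DAL  # hamza below, CGJ, kasra

# Canonical combining classes drive the reordering.
for ch in ("\u0650", "\u0655", "\u034F"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: ccc={unicodedata.combining(ch)}")

# Both mark orders normalize to the same sequence (kasra first).
print(unicodedata.normalize("NFC", hamza_kasra) == kasra_hamza)
print(unicodedata.normalize("NFC", kasra_hamza) == kasra_hamza)

# CGJ has class 0, so the marks on either side of it are not swapped.
print(unicodedata.normalize("NFC", hamza_cgj_kasra) == hamza_cgj_kasra)
```

A shaper that normalizes its input therefore sees kasra before hamza below in the first two cases, whatever order was typed.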

The CGJ issue in Firefox is most likely the font fallback bug I suspected earlier, as it does not show in Chrome which uses a different font fallback strategy. Reported it: https://bugzilla.mozilla.org/show_bug.cgi?id=1408366

khaledhosny avatar Oct 13 '17 12:10 khaledhosny

The CGJ issue is now fixed in Firefox Nightly: image

khaledhosny avatar Oct 14 '17 18:10 khaledhosny

Wow, that was quick! Thanks.

r12a avatar Oct 15 '17 07:10 r12a

Thanks to @jfkthame for the quick fix.

khaledhosny avatar Oct 15 '17 08:10 khaledhosny

How does this interact with Unicode's new guidance on reordering Arabic as first stage of the display pipeline? http://unicode.org/reports/tr53

asmusf avatar Oct 16 '17 03:10 asmusf

It should mean that the method proposed in UTR#53 of overriding the repositioning of diacritics will work on all major browsers on Mac OS X (I haven't tested on Windows).

r12a avatar Oct 16 '17 09:10 r12a

The other two cases are fixed in Firefox Nightly as well, so all the three render the same now: screenshot-2017-10-19 arabic character picker 20

khaledhosny avatar Oct 19 '17 14:10 khaledhosny

> The other two cases are fixed in Firefox Nightly as well, so all the three render the same now:

But I broke the third in HarfBuzz master by not skipping CGJ, right? I'll try to fix that soon.

behdad avatar Oct 19 '17 17:10 behdad

@behdad actually, @khaledhosny's result may be correct, since the third one (when using the above link) has

U+0642 ARABIC LETTER QAF
U+0627 ARABIC LETTER ALEF
U+0649 ARABIC LETTER ALEF MAKSURA
U+0655 ARABIC HAMZA BELOW
U+034F COMBINING GRAPHEME JOINER
U+0650 ARABIC KASRA
U+062F ARABIC LETTER DAL

I think this might be a better test case: https://r12a.github.io/pickers/arabic/?text=%D9%82%D8%A7%D9%89%D9%90%D9%95%D8%AF%0A%D9%82%D8%A7%D9%89%D9%90%D9%95%D8%AF%0A%D9%82%D8%A7%D9%89%D9%90%CD%8F%D9%94%D8%AF

It replaces hamza cgj kasra with kasra cgj hamza in the third line.

r12a avatar Oct 19 '17 17:10 r12a

Here are some screenshots of test results showing how browsers render the same example in some common fonts.

Firefox on Mac firefoxmac2

Chrome on Mac chromemac2

Opera on Mac operamac2

Safari (8.0.3, not latest version) safarimac2

Edge on Windows edgewin2

Default fonts should be: Times New Roman on Firefox, Times on Safari, Geeza Pro (for the Arabic part) on Chrome/Opera.

What is noticeable is that a modern font like Droid Naskh, and more basic fonts (the default font and Courier New), give good results in almost all browsers, with or without CGJ.

ntounsi avatar Oct 19 '17 22:10 ntounsi

Droid Arabic probably has a hamza+kasra ligature and maps both <hamza><kasra> and <kasra><hamza> to it, so the order of the marks does not matter in this case.

khaledhosny avatar Oct 20 '17 18:10 khaledhosny

I put up a quick test for one of the examples mentioned in UTR 53, which is relevant here. See https://w3c.github.io/i18n-tests/utr53/exp-ar-positioning-000.html

r12a avatar Mar 08 '19 16:03 r12a