
Arabic text is not mapped to correct glyphs

Open nashwaan opened this issue 7 years ago • 13 comments

I am trying to write simple Arabic text consisting of 3 consecutive characters with the same Unicode code point (the letter Ain, U+0639, which can be represented by 4 glyphs depending on its position in the word).

pageContentContext->WriteText(50, 200, u8"ععع", textOptions);

I also tried hard-coding the text as UTF-8 bytes (U+0639 -> \xD8\xB9):

pageContentContext->WriteText(50, 100, "\xD8\xB9\xD8\xB9\xD8\xB9", textOptions);
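For reference, the \xD8\xB9 bytes above can be checked with a small UTF-8 encoder. This is a minimal sketch; `encodeUtf8` is a hypothetical helper written for illustration, not a PDF-Writer function, and it does not reject surrogates or out-of-range values.

```cpp
#include <string>

// Hypothetical helper (not part of PDF-Writer): encode one Unicode
// code point as UTF-8. No validation of surrogates or values > 0x10FFFF.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx (U+0639 falls in this range)
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encodeUtf8(0x0639)` yields the two bytes D8 B9, so both WriteText calls pass the same byte sequence and the encoding itself is not the problem.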

But the output in the PDF is shown as: ﻉﻉﻉ. The correct output should be: ععع

Is the Unicode-to-glyph mapping not working correctly, or am I missing something here?

nashwaan avatar Jul 18 '16 10:07 nashwaan

You got it just right. Hummus uses a simple glyph mapping which does not take Arabic script shaping into account (and then there's right to left). There was a discussion on how to resolve this in https://github.com/galkahana/HummusJS/issues/56. You can take the solution we figured out there.
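To illustrate what contextual shaping means for the ععع example, here is a minimal sketch that handles only the letter Ain. The function names are hypothetical and this is not the solution from the linked issue; it relies only on the Unicode Arabic Presentation Forms-B code points for Ain (U+FEC9 isolated, U+FECA final, U+FECB initial, U+FECC medial). A real shaper has to cover every letter, non-joining letters, ligatures, and right-to-left reordering.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: pick the presentation form of AIN (U+0639)
// based on whether its neighbours (in logical order) are also AIN.
// Any other code point is passed through unchanged.
std::u32string shapeAinRun(const std::u32string& logical) {
    const char32_t AIN = 0x0639;
    std::u32string shaped;
    const std::size_t n = logical.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (logical[i] != AIN) {
            shaped += logical[i];
            continue;
        }
        const bool joinsPrev = i > 0 && logical[i - 1] == AIN;
        const bool joinsNext = i + 1 < n && logical[i + 1] == AIN;
        if (joinsPrev && joinsNext) {
            shaped += char32_t(0xFECC);  // medial form
        } else if (joinsNext) {
            shaped += char32_t(0xFECB);  // initial form
        } else if (joinsPrev) {
            shaped += char32_t(0xFECA);  // final form
        } else {
            shaped += char32_t(0xFEC9);  // isolated form
        }
    }
    return shaped;
}
```

For three consecutive Ain characters this returns U+FECB, U+FECC, U+FECA in logical order, whereas the output reported in this issue corresponds to U+FEC9 three times. The result would still need right-to-left reordering before being drawn, and the font must contain those glyphs.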

galkahana avatar Jul 18 '16 11:07 galkahana

Thanks for your quick reply.

The solution pointed out by @hussasad looks promising, albeit it is written in JavaScript. I need to figure out how to do the same with the C++ version.

nashwaan avatar Jul 18 '16 11:07 nashwaan

@nashwaan can you provide a fully working C++ sample that shows the problem? I'm back on tackling this issue.

amrnablus avatar Aug 14 '16 01:08 amrnablus

I'm having the same issue. Is it fixed?

reemshahban avatar Aug 09 '18 13:08 reemshahban

@reemshahban It's fixed on my fork but it's not been merged. If I remember correctly @galkahana said it needs to be tested on Windows before we can merge. Would you be able to do that?

amrnablus avatar Aug 09 '18 14:08 amrnablus

@amrnablus I tried to test the code of your fork on Windows but had some issues with the PDFWriter code itself (could not get Arabic text to show). Since those issues seem to be fixed on the master code, I merged your fixTextDirection(const std::string& inText, const std::string& charset) with the code that uses fribidi only, and it works as expected, as you can see below (@amrnablus code vs original code). Not sure if you want me to test something else.

[screenshots: good, bad]

MarcoMartins86 avatar Sep 25 '18 14:09 MarcoMartins86

Looks good to me, but @galkahana has some concerns about using fribidi only. Would you be able to test ICU as well?

amrnablus avatar Sep 25 '18 18:09 amrnablus

Did it now, same results, so both of them are OK.

Edit: it may be important to mention that I built all libraries using Visual Studio 2013 and not MinGW.

MarcoMartins86 avatar Sep 26 '18 10:09 MarcoMartins86

Sounds good. To be honest, I have no experience whatsoever with building on Windows, so I can't really comment. Best to get a review from @galkahana and then possibly merge. Thanks @MarcoMartins86

amrnablus avatar Sep 26 '18 18:09 amrnablus

Hi. I'm using this code and I noticed two memory leaks in it. One is at line 65 of UnicodeTextUtils.cpp (visual_str is never freed). The other one is deeper in the C code and I didn't have the time to pinpoint it, but it seemed to have something to do with the list manipulations in fribidi-run.c.
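One common way to avoid the first kind of leak is to let a smart pointer own the malloc'd buffer so it is freed on every return path. The sketch below is an assumption about the general pattern only: it is not the code in UnicodeTextUtils.cpp, `reorderForDisplay` and `VisualBuffer` are hypothetical names, and the plain reversal merely stands in for the real bidi reordering step.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <string>

// Deleter that releases memory obtained from std::malloc.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using VisualBuffer = std::unique_ptr<std::uint32_t[], FreeDeleter>;

// Hypothetical stand-in for a reordering step that fills a malloc'd
// "visual" buffer. Here it simply reverses the code points.
std::u32string reorderForDisplay(const std::u32string& logical) {
    const std::size_t n = logical.size();
    VisualBuffer visual(static_cast<std::uint32_t*>(
        std::malloc((n + 1) * sizeof(std::uint32_t))));
    if (!visual) {
        return std::u32string();  // allocation failed, nothing to free
    }
    for (std::size_t i = 0; i < n; ++i) {
        visual[i] = static_cast<std::uint32_t>(logical[n - 1 - i]);
    }
    std::u32string result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(static_cast<char32_t>(visual[i]));
    }
    return result;  // visual is freed automatically when it goes out of scope
}
```

With this pattern no explicit free call is needed, so an early return or an exception cannot skip it. Using a std::vector as the buffer would work equally well when the API only needs a pointer and a length.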

rcosteira79 avatar Mar 18 '19 16:03 rcosteira79

Ah, thanks! Can you provide sample code that causes the memory leak?

amrnablus avatar Mar 18 '19 16:03 amrnablus

Well, I can't provide a sample right now, but it seems to happen when you call the AbstractContentContext::fixTextDirection function.

rcosteira79 avatar Mar 18 '19 17:03 rcosteira79

OK, I'll try to take a look this weekend, thanks!

amrnablus avatar Mar 20 '19 14:03 amrnablus