UTF-8 Support

Open krupong opened this issue 1 year ago • 10 comments

I am testing the signature -> set_metadata_props feature, but the text is not shown correctly. My signing reason is "ทดสอบ".

Screenshot_20240729_143316

Does it support UTF-8 encoding? Thank you.

krupong avatar Jul 29 '24 07:07 krupong

Try #79

erikn69 avatar Jul 29 '24 17:07 erikn69

Hello, I've tried it.

Try #79

It truncates some characters: for example, "ภาษาไทย" comes back as "ภา".

image

So I changed this:

return "\xFE\xFF" . mb_convert_encoding($string, 'UTF-16BE', $encoding);

to:

return "\xEF\xBB\xBF".mb_convert_encoding($string, 'UTF-8', $encoding);

Now it shows correctly. image

Thank you.

krupong avatar Jul 30 '24 12:07 krupong

So I changed this: return "\xFE\xFF" . mb_convert_encoding($string, 'UTF-16BE', $encoding); to: return "\xEF\xBB\xBF".mb_convert_encoding($string, 'UTF-8', $encoding);

With that change I get this:

image

erikn69 avatar Jul 30 '24 14:07 erikn69
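
One possible reason the same change looks right in one viewer and wrong in another: in PDF 1.7 a text string is either PDFDocEncoding or UTF-16BE introduced by the bytes FE FF, while the EF BB BF (UTF-8) marker was only added in PDF 2.0. A reader that does not know the UTF-8 marker falls back to a single-byte interpretation. The sketch below shows that detection rule; `pdf_text_string_encoding` is a hypothetical name used for illustration, not part of sapp.

```php
<?php
// Hypothetical helper (not sapp's API): report how a PDF reader would
// classify the raw bytes of a text string, based on the leading marker.
function pdf_text_string_encoding(string $bytes): string
{
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';            // understood by PDF 1.x and 2.0 readers
    }
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';               // only defined since PDF 2.0
    }
    return 'PDFDocEncoding';          // single-byte fallback
}

echo pdf_text_string_encoding("\xFE\xFF\x0E\x20"), "\n";
echo pdf_text_string_encoding("\xEF\xBB\xBF" . "abc"), "\n";
echo pdf_text_string_encoding("plain reason"), "\n";
```

Under this rule the UTF-16BE form is the more portable choice, provided the bytes are written into the file safely.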

What about using a custom encoded string when setting the metadata?

dealfonso avatar Jul 30 '24 14:07 dealfonso

What about using a custom encoded string when setting the metadata?

That would work, but there would be the problem that every time someone doesn't know that they should do their own encoding, they will have problems and open a new issue.

erikn69 avatar Jul 30 '24 14:07 erikn69

@dealfonso One question: if the file declares ANSI as its encoding, and the reason is in UTF-8 or another encoding, wouldn't this problem occur?

Look, I sent UTF-8 and it doesn't work:

/Reason(ภาษาไทย)/Location(sdfs ó í í)>>

But it does work when I send ISO-8859-1:

/Reason(ó í í {} ` ~)/Location(sdfs ó í í)>>

erikn69 avatar Jul 30 '24 14:07 erikn69
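
The difference between those two cases comes down to bytes. As a rough illustration (assuming the reader treats an unmarked string as one byte per character), the same "ó" is two bytes in UTF-8 but a single byte in ISO-8859-1, so only the latter survives a single-byte interpretation. The variable names are illustrative only.

```php
<?php
// "ó" written as an escape so the example does not depend on how this
// file itself is saved.
$utf8   = "\u{00F3}";
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";    // c3b3 (two bytes)
echo bin2hex($latin1), "\n";  // f3   (one byte)

// Reinterpret the two UTF-8 bytes as if each were a single-byte character:
// this is what a reader does when nothing marks the string as Unicode.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n";          // two characters instead of one
```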

Honestly, I had not considered this topic before. A quick Google search [1] suggests that PDF does not handle character encoding in a general way: the encoding depends on the font, and the same character code can be rendered differently depending on the font.

I don't know how this applies to the reason and similar fields.

That is why my "quick answer" is that PDF does not support UTF-8, so users need to encode the characters according to their needs.

I'll read more about character encoding in the metadata. Do you have any sources I could read?

[1] https://www.gnostice.com/nl_article.asp?id=383&t=Font_and_Encoding_Standard_types_supported_in_PDF_for_the_representation_of_text_content

dealfonso avatar Jul 30 '24 15:07 dealfonso

It considers that the encoding depends on the font, and depending on the font, the same character will show a representation or another

But that applies to text content; metadata doesn't use fonts.

erikn69 avatar Jul 30 '24 15:07 erikn69

I tried FPDF, and it works with UTF-8:

/Keywords (þÿ Ì + ^ ì ò Ò ê)

But it doesn't work here. https://github.com/Setasign/FPDF/blob/0838e0ee4925716fcbbc50ad9e1799b5edfae0a0/fpdf.php#L1169C1-L1189C2

erikn69 avatar Jul 30 '24 16:07 erikn69

I tried signing with TCPDF; it works with UTF-8 too. When opened in VS Code:

Screenshot 2567-07-31 at 11 24 39

Signed with sapp, the reason seems to be stored as plain text: Screenshot 2567-07-31 at 11 30 07

krupong avatar Jul 31 '24 04:07 krupong

I tried #79, encoding the metadata as UTF-16BE with a BOM, and it works.

The problem is that when my string contains "\x0E\x28" (ศ) or "\x0E\x29" (ษ), the metadata is broken. I think that when the string is compiled into the PDF, the encoded bytes include "(" or ")", which causes the incorrect display.

For example, I set my string to "ภาษาไทย". image

It shows like this: image

As a dirty fix, I add "(" or ")" at the beginning or end of the string.

So I add "(" to the beginning of "ภาษาไทย", like this: "(ภาษาไทย". image

Then my signature shows like this: image

The text displays correctly, but it still has "(" in front of it.

Is there a correct way to deal with this problem?

Thank you.

krupong avatar Sep 21 '24 04:09 krupong
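
The diagnosis above matches how PDF literal strings work: inside "( ... )" an unbalanced parenthesis byte ends or corrupts the string, and ษ (U+0E29) becomes the bytes 0E 29 in UTF-16BE, where 29 is ")". A minimal sketch of two common ways to write such bytes safely follows: escaping the special bytes, or using a hex string. The function names are hypothetical and this is not necessarily how sapp or #84 handles it.

```php
<?php
// Hypothetical helpers for illustration; not part of sapp's API.

// Encode text as a PDF text string: UTF-16BE preceded by the FE FF marker.
function pdf_text_string_bytes(string $text, string $encoding = 'UTF-8'): string
{
    return "\xFE\xFF" . mb_convert_encoding($text, 'UTF-16BE', $encoding);
}

// Escape the bytes that are special inside a literal string "( ... )".
// A raw CR is escaped too, because readers normalise unescaped line endings.
function pdf_escape_literal(string $bytes): string
{
    return strtr($bytes, [
        '\\' => '\\\\',
        '('  => '\\(',
        ')'  => '\\)',
        "\r" => '\\r',
    ]);
}

// Alternative: a hex string "<...>" contains no bytes that need escaping.
function pdf_hex_string(string $bytes): string
{
    return '<' . strtoupper(bin2hex($bytes)) . '>';
}

// "ภาษาไทย" written with escapes so the file's own encoding does not matter.
$reason = "\u{0E20}\u{0E32}\u{0E29}\u{0E32}\u{0E44}\u{0E17}\u{0E22}";
$bytes  = pdf_text_string_bytes($reason);

$literal = '(' . pdf_escape_literal($bytes) . ')';
$hex     = pdf_hex_string($bytes);

echo bin2hex($bytes), "\n"; // feff0e200e320e290e320e440e170e22
echo $hex, "\n";            // <FEFF0E200E320E290E320E440E170E22>
```

The third character contributes the 29 byte, which is exactly where the unescaped string was cut off after "ภา". With escaping, no extra "(" has to be added to the visible text.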

Feel free to make a PR with the fix 👍

angeljqv avatar Sep 21 '24 12:09 angeljqv

Could you please check #84?

Thank you.

krupong avatar Sep 21 '24 18:09 krupong

merged

dealfonso avatar Nov 08 '24 09:11 dealfonso