js-fdf

Unicode issues

Open thomasf1 opened this issue 10 years ago • 7 comments

I'm having problems with Unicode characters: they seem to come out garbled.

There are some discussions about it, and I've found someone who wrote a PHP FDF implementation that seems to address the issue:

https://github.com/mikehaertl/php-pdftk/blob/master/src/FdfFile.php

thomasf1 avatar Aug 21 '15 20:08 thomasf1

Can you give me an example of some data you are putting in that's coming back weird? It would make it easier to have a use-case where the library is broken.

Also, any idea where in that code it handles Unicode? At a glance it looks like it's here, but I hardly ever work with PHP, so I'm not 100% sure what I'm looking at.

Reading the comments, it looks like FDF files store their keys and values using UTF-16 (Big Endian), which is probably the problem. Light research indicates that JS uses either, depending on how it's implemented (which may be handled differently in Node).

I'm now looking for a way to convert a normal string to UTF-16BE in JavaScript, to see if there's an easy way of fixing this.
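
For reference, a minimal sketch of one way to do that conversion using only Node built-ins. It assumes a Node version whose `Buffer` provides `swap16()`. The helper name `toUtf16BE` is made up for illustration and is not part of js-fdf. Node has no `utf16be` encoding name, so the sketch encodes as UTF-16LE and swaps each byte pair, then prepends the big-endian byte order mark (FE FF).

```javascript
// Hypothetical helper, not part of js-fdf: encode a string as UTF-16BE with a BOM.
function toUtf16BE(str) {
  // Buffer only knows little-endian UTF-16, so encode as LE first...
  const body = Buffer.from(str, 'utf16le');
  // ...then swap every pair of bytes in place to get big-endian order.
  body.swap16();
  // Prepend the big-endian byte order mark FE FF.
  return Buffer.concat([Buffer.from([0xfe, 0xff]), body]);
}

console.log(toUtf16BE('é').toString('hex')); // feff00e9
```

The result is a Buffer, not a string, so anything that concatenates it into a normal JS string and writes that out as UTF-8 will corrupt the bytes again.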

Albert-IV avatar Aug 21 '15 21:08 Albert-IV

Looking at it, it looks like punycode handles string conversions, plus it's bundled with Node by default.

Helpful links: 2ality post on Unicode in JavaScript; punycode documentation

Albert-IV avatar Aug 21 '15 21:08 Albert-IV

Thanks for looking into it. I've been experimenting with iconv-lite to get it converted to UTF-16BE, but it seems that it needs to work with Buffers rather than strings...

thomasf1 avatar Aug 21 '15 22:08 thomasf1

I've gotten it to work with iconv-lite when patching the header manually... I'm not quite sure how to write out the header the right way, though; the current approach doesn't quite work with the Buffers...
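
A rough sketch of what a Buffer-only approach could look like, with the header written as raw bytes instead of a JS string. This is an illustration under assumptions, not js-fdf's or iconv-lite's actual code: the helper names `pdfTextString` and `buildFdf` are hypothetical, the FDF layout follows the commonly seen `%FDF-1.2` skeleton, and the output has not been verified against pdftk here.

```javascript
// Hypothetical helpers, not part of js-fdf.

// Encode a string as a PDF literal string: "(" + BOM + UTF-16BE bytes + ")".
// The bytes for "(", ")" and "\" are escaped at the byte level, because they
// can show up as either half of a UTF-16 code unit.
function pdfTextString(str) {
  const utf16 = Buffer.concat([
    Buffer.from([0xfe, 0xff]),              // big-endian BOM
    Buffer.from(str, 'utf16le').swap16(),   // LE bytes swapped to BE
  ]);
  const out = [0x28];                       // opening "("
  for (const byte of utf16) {
    if (byte === 0x28 || byte === 0x29 || byte === 0x5c) out.push(0x5c);
    out.push(byte);
  }
  out.push(0x29);                           // closing ")"
  return Buffer.from(out);
}

// Assemble a whole FDF document as one Buffer, never as a JS string.
function buildFdf(fields) {
  const ascii = (s) => Buffer.from(s, 'latin1');
  const parts = [
    ascii('%FDF-1.2\n'),
    // Binary marker line "%âãÏÓ" written as raw bytes so it is not
    // re-encoded as multi-byte UTF-8.
    Buffer.from([0x25, 0xe2, 0xe3, 0xcf, 0xd3, 0x0a]),
    ascii('1 0 obj\n<<\n/FDF\n<<\n/Fields [\n'),
  ];
  for (const [name, value] of Object.entries(fields)) {
    parts.push(
      ascii('<<\n/T '), pdfTextString(name),
      ascii('\n/V '), pdfTextString(String(value)),
      ascii('\n>>\n')
    );
  }
  parts.push(ascii(']\n>>\n>>\nendobj\ntrailer\n\n<<\n/Root 1 0 R\n>>\n%%EOF\n'));
  return Buffer.concat(parts);
}
```

The returned Buffer would then be written as-is, for example with `fs.writeFileSync('data.fdf', buildFdf({ name: 'Jörg' }))`, without passing an encoding.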

thomasf1 avatar Aug 21 '15 22:08 thomasf1

Encoding issues are a bi***... I've actually given up, looked around, and found that the xfdf npm package handles it much better... :)

thomasf1 avatar Aug 22 '15 11:08 thomasf1

Cool find. It might be good to merge this project into the xfdf one, since they're both tiny and similar in purpose (handling different flavours of Adobe data).

countable avatar Aug 22 '15 22:08 countable

Great :)...

One thing I stumbled upon while searching for a solution was http://rhaseventh.blogspot.de/2014/04/node-js-pdf-fill-from-fdf-with-utf-16.html. I couldn't quite get it to write out the header's special characters correctly, but it might help anyway :)

thomasf1 avatar Aug 24 '15 15:08 thomasf1