hyperref
UTF8 in TextField's 'default'
As in https://github.com/ho-tex/hyperref/issues/5, LuaLaTeX produces an incorrect result if you use non-ASCII characters for 'default' in TextField. pdfLaTeX seems to produce the correct result if you use only Latin-1 characters, but the problem also occurs with pdfLaTeX if you use Japanese characters for 'default'. The following hack fixes this problem (thanks to Ms. Fischer's comment in https://github.com/ho-tex/hyperref/issues/5).
\documentclass{article}
\usepackage{fontspec} % ---- for LuaLaTeX
% \usepackage[utf8]{inputenc} % ---- for pdfLaTeX
\usepackage[unicode=true]{hyperref}
\usepackage[T1]{fontenc}
% FIX for 'default'
% See also the fix for 'value' by Ms. Fischer
% ( https://github.com/ho-tex/hyperref/issues/5 ).
% I just replaced 'value' with 'default'.
\makeatletter
\define@key{Field}{default}{%
\Hy@pdfstringdef\Fld@default{#1}}
\makeatother
% NOTE:
% This fix IS needed for pdfLaTeX too, when you use Japanese
% characters.
%
% \documentclass{article}
% \usepackage[whole]{bxcjkjatype}
% \usepackage[utf8]{inputenc}
% \begin{document}
% \begin{Form}
% % without fix, fails to compile
% \TextField[name=addr,default=東京]{Address} %Tokyo
% \end{Form}
% \end{document}
\begin{document}
\begin{Form}
% OK, of course
\TextField[name=textfield1]{Address} \\
% OK
\TextField[name=textfield2,value=Köln]{Address} \\
% correct on pdfLaTeX (but incorrect if you use Japanese characters)
% incorrect without FIX on LuaLaTeX
\TextField[name=textfield3,default=München]{Address} \\
% (though this is probably meaningless)
% incorrect (internally) on LuaLaTeX
%
% $ pdftk pr-textfield-default-encoding.pdf dump_data_fields
% ...
% FieldValue: Köln % This is OK, but
% FieldValueDefault: München % garbage!
% ...
\TextField[name=textfield4,value=Köln,default=München]{Address}
\end{Form}
\end{document}
Sorry, I have noticed that this fix must be applied ONLY to TextField. It causes improper behavior for ChoiceMenu (with both pdfLaTeX and LuaLaTeX).
\begin{Form}
\ChoiceMenu[radio,name=choice,default=Yes]{TeX User}{Yes,No}
\end{Form}
with "FIX" produces:
$ pdftk pr2.pdf dump_data_fields
---
FieldType: Button
FieldName: choice
FieldFlags: 49152
FieldValue: \376\377\000Y\000e\000s # should be "Yes"
FieldJustification: Left
FieldStateOption: Yes
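One way to confine the FIX to TextField is to redefine the 'default' key only inside a group around a single \TextField call. The following is a minimal sketch, not a tested patch: the wrapper name \FixedTextField is hypothetical, and it assumes that hyperref reads the Field keys while \TextField is being executed, so that a local redefinition is sufficient.

```latex
\makeatletter
% Hypothetical wrapper: applies the 'default' fix to one TextField only.
% #1 = key-value options, #2 = label
\newcommand*{\FixedTextField}[2][]{%
  \begingroup
    % Local redefinition; it is undone at \endgroup, so
    % \ChoiceMenu and other fields keep the original 'default' key.
    \define@key{Field}{default}{%
      \Hy@pdfstringdef\Fld@default{##1}}%
    \TextField[#1]{#2}%
  \endgroup
}
\makeatother
```

It would be used like \TextField, for example \FixedTextField[name=textfield3,default=München]{Address}, while \ChoiceMenu is left untouched.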
The fix for the default field is certainly needed. But I don't see a problem with the choice menu. With the option unicode you are forcing everything into UTF16BE, and so Yes is encoded as \376\377\000Y\000e\000s. If you don't like this try \usepackage[pdfencoding=auto]{hyperref} instead.
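For reference, a minimal sketch of the suggested alternative, using the same ChoiceMenu as above. The assumption here is that with pdfencoding=auto a pure ASCII string such as Yes is not converted to UTF-16BE; this has not been verified for every engine and hyperref version.

```latex
\documentclass{article}
% pdfencoding=auto instead of unicode=true
\usepackage[pdfencoding=auto]{hyperref}
\begin{document}
\begin{Form}
\ChoiceMenu[radio,name=choice,default=Yes]{TeX User}{Yes,No}
\end{Form}
\end{document}
```

Running pdftk with dump_data_fields on the resulting PDF should show whether FieldValue is now reported as plain Yes.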
Thank you for your reply. I thought the FIX should not be applied to ChoiceMenu, because the following code with the FIX does not work as expected.
\documentclass{article}
\usepackage{fontspec} % ---- for LuaLaTeX
% \usepackage[utf8]{inputenc} % ---- for PDFLaTeX
\usepackage[unicode=true]{hyperref}
\usepackage[T1]{fontenc}
\begin{document}
\begin{Form}
% 'Yes' is checked (as expected)
\ChoiceMenu[radio,name=nofix,default=Yes]{TeX User?}{Yes,No}
\makeatletter
\define@key{Field}{default}{%
\Hy@pdfstringdef\Fld@default{#1}}
\makeatother
% 'Yes' is NOT checked
\ChoiceMenu[radio,name=withfix,default=Yes]{TeX User?}{Yes,No}
\end{Form}
\end{document}
But this is probably caused by the inconsistent encoding of FieldValue and FieldStateOption.
So I think I should now say: not only FieldValue but also FieldStateOption should be encoded as UTF-16 for ChoiceMenu.
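To look at the encoding in question without going through a form field, hyperref's \pdfstringdef can be used to convert a string and write the result to the log. This is only a sketch: the macro name \tmpstring is chosen for illustration, and the exact log output may differ between engines and hyperref versions.

```latex
\documentclass{article}
\usepackage[unicode=true]{hyperref}
\begin{document}
% \pdfstringdef converts its second argument to a PDF string
% and stores the result in the macro given as first argument.
\pdfstringdef\tmpstring{Yes}
% Write the converted string to the log/terminal.
\typeout{\meaning\tmpstring}
See the log file for the converted string.
\end{document}
```

With unicode=true the logged string is expected to be the UTF-16BE form that starts with the byte order mark \376\377, like the FieldValue that pdftk reports for the field with the FIX.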
FYI: The results of pdftk.
- The PDF file generated by LuaLaTeX:
$ pdftk choice.pdf dump_data_fields
---
FieldType: Button
FieldName: nofix
FieldFlags: 49152
FieldValue: Yes
FieldJustification: Left
FieldStateOption: Yes
---
FieldType: Button
FieldName: withfix
FieldFlags: 49152
FieldValue: \376\377\000Y\000e\000s
FieldJustification: Left
FieldStateOption: Yes
- After the PDF file was edited with Acrobat Reader DC (both "No" fields checked):
$ pdftk choice.pdf dump_data_fields
---
FieldType: Button
FieldName: nofix
FieldFlags: 49152
FieldValue: No
FieldValue: Yes
FieldJustification: Left
FieldStateOption: No
FieldStateOption: Off
FieldStateOption: Yes
---
FieldType: Button
FieldName: withfix
FieldFlags: 49152
FieldValue: No
FieldValue: \376\377\000Y\000e\000s
FieldJustification: Left
FieldStateOption: No
FieldStateOption: Off
FieldStateOption: Yes
I don't know why the FieldValue is duplicated...
I see what you mean. I will look at it but not today.
I think that in the next version it will work for umlauts and other characters in the T1 encoding, but not for Japanese; that would, in my opinion, need extensive changes to the font resources.