hyperref
hyperref copied to clipboard
\index{a"|b} is broken but \index{a"@b}, \index{a""b}, \index{a"!b} all work
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{makeidx}
\usepackage{hyperref}
\makeindex
\begin{document}
% \index{a"|b}% error if this is uncommented
\index{a"@b}
\index{a""b}
\index{a"!b}
\printindex
\end{document}

But with the commented line uncommented:
! Argument of \T1\b has an extra }.
<inserted text>
\par
l.6 \item a|hyperindexformat{\b}
, 1
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on testindexhyperref.log.
Procedure:
pdflatex --halt-on-error testindexhyperref
makeindex testindexhyperref
pdflatex --halt-on-error testindexhyperref
I did check hyperref documentation but found nothing on this.
I looked briefly at code source but I don't have much time beyond quick workaround (I see the source deals with presence of packages multind, index, amsmidx which I don't want to have to deal with).
This appears to work (I can submit a PR with better notations; I kept the usage of tests in the spirit of original; the \\ is with normal catcodes in this and original). I leave possible nesting of conditionals, it does not seem important to move out of them.
The idea is very simple, scan for visible "| and replace them by "\barreplacement where the later is un unprotected macro which will give a | in .idx file, and once this is done use original macro which looks for |.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{makeidx}
\usepackage{hyperref}
\makeindex
% bug comes from this :
% \@wrindex #1->\@@wrindex #1||\\
% \@@wrindex #1|#2|#3\\->\if@filesw \ifx \\#2\\\protected@write \@indexfile {}{\s
% tring \indexentry {#1|hyperpage}{\thepage }}\else \HyInd@@@wrindex {#1}#2\\\fi
% \fi \endgroup \@esphack
\makeatletter
\def\zzzzbarreplacement{|}
\def\@wrindex #1{\zzzz@@wrindex #1"|\\{}}%
% no brace removal possible in #2 (if not empty it ends in "|)
\def\zzzz@@wrindex #1"|#2\\#3{\ifx\\#2\\\@@wrindex #3#1||\\\else
\zzzz@@wrindex #2\\{#3#1"\zzzzbarreplacement}\fi}
\makeatother
\begin{document}
AA
\index{a"|b}% ok now
\index{a"|b"|c@\textbf{a"|b"|c}|emph}% this is also ok
\index{a"@b}
\index{a""b}
\index{a"!b}
\printindex
\end{document}

\indexentry{a"|b|hyperpage}{1}
\indexentry{a"|b"|c@\textbf{a"|b"|c}|hyperindexformat{\emph}}{1}
\indexentry{a"@b|hyperpage}{1}
\indexentry{a""b|hyperpage}{1}
\indexentry{a"!b|hyperpage}{1}
(the above sort of assumes \index is used at top level; if not hence catcodes did not change, and there is a \\ in argument, all is broken but this holds for original code as well)
regarding potential brace removal in #1 they can be prevented by adding \empty token upfront first (and then repeatedly). These tokens will disappear when written out to file.
My naive proposal however breaks \"| case, because it does not take into account that it is " which is now escaped rather than | being quoted.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{makeidx}
\usepackage{hyperref}
\makeindex
% bug comes from this :
% \@wrindex #1->\@@wrindex #1||\\
% \@@wrindex #1|#2|#3\\->\if@filesw \ifx \\#2\\\protected@write \@indexfile {}{\s
% tring \indexentry {#1|hyperpage}{\thepage }}\else \HyInd@@@wrindex {#1}#2\\\fi
% \fi \endgroup \@esphack
\makeatletter
\def\zzzzbarreplacement{|}
% upstream code is nested in some \lowercase, let's
% delay worry about this to time of actually
% modifying original code rather than patching it
\edef\@wrindex{\noexpand\zzzzzzzz@wrindex{\@backslashchar"|\noexpand\\}}%
% added \empty token disappears when written out
\def\zzzzzzzz@wrindex#1#2{\zzzzzzzz@@wrindex \empty#2#1}%
\expandafter\def\expandafter
\zzzzzzzz@@wrindex\expandafter #\expandafter1\@backslashchar"|#2\\%
{%
\ifx\\#2\\%
% no \"| in input
\zzzz@@wrindex #1"|\\{}%
\else
% this #2 will end in \_{12}"| and we need to remove this
\zzzzzzzzHyInd@@@wrindex #2\\{#1\string\"}%
\fi
}%
\expandafter\def\expandafter
\zzzzzzzzHyInd@@@wrindex\expandafter #\expandafter1\@backslashchar"|\\#2%
{%
% original code would have here an #1 trimmed from any
% extra trailing |<stuff> but we
% don't go the effort of removing it (wrong input syntax)
\if@filesw\HyInd@@@wrindex {#2}#1\\\fi\endgroup \@esphack
}%
% no brace removal possible in #2 and in #1 either due to \empty token
% \@@wrindex is the original unmodified macro
\def\zzzz@@wrindex #1"|#2\\#3%
{%
\ifx\\#2\\%
\@@wrindex #3#1||\\%
\else
% loop, \empty tokens to avoid brace removal
\zzzz@@wrindex \empty#2\\{#3#1"\zzzzbarreplacement}%
\fi
}
\makeatother
\begin{document}
AA
\index{a"|b}% ok now
\index{a"|b"|c@\textbf{a"|b"|c}|emph}% this is also ok
\index{a\"@a\string\"|emph}% ok, we use @ else \" will combine with comma in
% PDF but code above could also be modified
% at location of insertion of \string\"
\index{"|a\"@"|a\string\"}% ok
\index{a"@b}
\index{a""b}
\index{a"!b}
\printindex
\end{document}
\indexentry{a"|b|hyperpage}{1}
\indexentry{a"|b"|c@\textbf{a"|b"|c}|hyperindexformat{\emph}}{1}
\indexentry{a\"@a\string\"|hyperindexformat{\emph}}{1}
\indexentry{"|a\"@"|a\string\"|hyperpage}{1}
\indexentry{a"@b|hyperpage}{1}
\indexentry{a""b|hyperpage}{1}
\indexentry{a"!b|hyperpage}{1}

in my examples I used \index{"|a\"@"|a\string\"}% ok which gives a \" in output, but in fact I should have used in the @ part \textquotedbl:
\index{a"|b}% ok now
\index{a"|b"|c@\textbf{a"|b"|c}|emph}% this is also ok
\index{a\"@a\textquotedbl|emph}% ok, we use @ else \" will combine with comma
\index{"|a\"@"|a\textquotedbl}% ok
\index{a"@b}
\index{a""b}
\index{a"!b}
produces

in my examples I used
\index{"|a\"@"|a\string\"}% okwhich gives a\"in output, but in fact I should have used in the@part\textquotedbl:
in fact I used \string\"| on purpose in the examples to test \"| case so it was deliberate. Of course with \textquotedbl it is not the same code branch which is tested... time to go to bed.
\index{a"|b}% ok now
\index{a"|b"|c@\textbf{a"|b"|c}|emph}% this is also ok
\index{a\"@a\string\"|emph}% ok, we use @ else \" will combine with comma as accent
% the \string\" above is deliberate to test parsing of \"| situation, a sole a\"|emph would test it also
% but the in PDF output we have the \" macro acting on the comma separator with page number
% which is irrelevant but is the reason the test is done as above (of course then the print out
% contains backslash double-quote but the sorting was done on only double-quote)
\index{"|a\"@"|a\textquotedbl}% ok
\index{a"@b}
\index{a""b}
\index{a"!b}
Can't you in your input simply protect the |? E.g. with
\index{a{"|}b}
or
\DeclareRobustCommand\mybar{|} \index{a\mybar b}
?
I'm a bit reluctant to touch this code for a rather rare use case. It is imho rather fragile anyway and I don't know if a change will affect other packages or documents, so probably some research would be needed.
I'm a bit reluctant to touch this code for a rather rare use case. It is imho rather fragile anyway and I don't know if a change will affect other packages or documents, so probably some research would be needed.
I understand that and I am of the opinion that widely used old packages should basically not change because any change can break long established workarounds in documents or templates or automated processors. I once had a discussion with Frank about this regarding the parskip package. But hyperref did not as parskip remain unchanged for 2 decades.
Thus I opened the ticket because the package has a bug, which afaict has not been documented.
The original code looks quite fragile in contexts where the \index macro will not have been able to reset the catcodes.
Can't you in your input simply protect the
|? E.g. with\index{a{"|}b}or
\DeclareRobustCommand\mybar{|} \index{a\mybar b}
?
In my original context we probably will end up doing something like this. It affects the sorting with makeindex, but when using xindy one can add a merge-rule and then obtain a really complete solution. In fact in my original context the | is currently escaped to \textbar{} globally, and I hit against the reported bug when experimenting with removing this escaping (which has a cost).
My original context should not have any \index used in argument of other command (we use environments for footnotes), nor any usage of a package modifying the core \index mechanism (apart from hyperref) but it has usage of tabulary which gathers its contents. For this reason only something robust or non expanding as in your proposals should probably be used and it would be to guarantee makeindex will not be given two distinct things for the same document input, depending on its location in document. Again with xindy, merge rules can be used to work around these problems. Perhaps nowadays \protected is the way to go.
(edit: in the special case of the | escaping, not using a robust or protected command also induces potential for broken .idx entry, if the original \index could not change catcodes and the command thus expanded; it would then need to expand to "| but we don't want that ultimately in the normal case where the command is not expanded. Anyway whether protected or robust they give extra space in .idx file if catcode of backslash could not be sanitized. So in the context here {"|} looks the best among the two proposals)
viz my comment about tabulary there are problems with extra space whether using robust or protected commands
it could be that this is a duplicate of #24
Yes looks like a duplicate.