
Page without StructParents, syntax problem

Open viktoriasee opened this issue 5 years ago • 10 comments

Steps to reproduce

Run this minimal example with either pdftex or lualatex:

\documentclass{scrreprt}
\usepackage{luatex85}
\usepackage[tagged]{accessibility}

\begin{document}
Text.
\end{document}

Check the generated output in PAC version 3.0.7.0.

[Screenshot: pac3-latex-accessibility-minimal_crop]

You get the error "Page without StructParents".

Expected behaviour (correct)

The StructParents entry should be there.

viktoriasee avatar Feb 11 '20 11:02 viktoriasee

@viktoriasee I've invited you to join the project as a collaborator as you seem to have time to spend on it, and I'd appreciate some help! This might be easier than using forks and pull requests.

AndyClifton avatar Feb 12 '20 07:02 AndyClifton

I feel honoured, thanks. I indeed have some time but I am not a programmer so I need help.

\documentclass{scrreprt}
\usepackage{tagpdf}

\tagpdfsetup{activate-all}

\begin{document}
Text.
\end{document}

in pdftex creates the StructParents entry. Is this a hint? Maybe we should bring Ulrike Fischer on board.

[Screenshot: structparent-minimal]

viktoriasee avatar Feb 12 '20 09:02 viktoriasee

A PDF without the error above will contain a page object something like this:

<<
/Type /Page
/Contents 17 0 R
/Resources 16 0 R
/MediaBox [0 0 612 792]
/StructParents 0/Tabs/S
/Parent 21 0 R
>>

A PDF as it's produced by accessibility right now looks like this:

<<
/Type /Page
/Contents 17 0 R
/Resources 16 0 R
/MediaBox [0 0 612 792]
/Parent 21 0 R
>>

One can use tagpdf with the uncompress option to create a human-readable PDF.
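
A minimal sketch of that setup, assuming the uncompress key is passed through \tagpdfsetup next to activate-all (the key name is the one mentioned above and may differ between tagpdf versions):

```latex
\documentclass{scrreprt}
\usepackage{tagpdf}

% activate-all switches tagging on, as in the earlier example;
% uncompress (assumed key name) turns off PDF stream compression
\tagpdfsetup{activate-all,uncompress}

\begin{document}
Text.
\end{document}
```

The page objects should then be readable when the resulting PDF is opened in a text editor.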

viktoriasee avatar Feb 12 '20 10:02 viktoriasee

I've learned from the reference (p. 147) that StructParents entries for page objects are mandatory in a tagged PDF. They may be needed for other objects such as images too.

viktoriasee avatar Feb 13 '20 14:02 viktoriasee

Source of error

It looks like the general PDF object is written to PDF in accessibility.sty in lines 568 to 575:

\immediate \pdfobj useobjnum \theStructTree{%
    <</Type /StructTreeRoot %
        /RoleMap \theObjHelp \space 0 R %
        /ClassMap \theClassMap \space 0 R %
        /ParentTree <</Nums [0 [\Karray]]>> % TODO: needs a much more complicated
        /ParentTreeNextKey 1 % computation
        /K [\Karray] %
    >>}\pdfrefobj\pdflastobj%

(and lines 1032 to 1039 of the .dtx file, which is where the changes need to be made so that they propagate correctly; changing the .sty file in tests/article is fine for testing)

Mitigation

If I understand this right, it means that when the object is a page, we need to add /StructParents 0/Tabs/S, where the value of /StructParents is:

the integer key of the page's entry in the structural parent tree

That value is described in "finding structural elements from content items" on page 868 of the manual.

How to proceed

Suggested approximate steps to correct this:

  • work out where that page object is defined
  • identify what the integer value should be
  • create a new output from /StructTreeRoot that includes the correct /StructParents 0/Tabs/S

N.B. I think I understand why this was left as a "TODO"....

AndyClifton avatar Feb 16 '20 10:02 AndyClifton

Is it really that complicated? When I add \pdfpageattr{/StructParents 0/Tabs/S} to my document preamble the error is gone.
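
For reference, a minimal sketch of that workaround applied to the example from the first post. Only the \pdfpageattr line is new. As far as I understand, \pdfpageattr adds the same entries to every page dictionary, so every page gets key 0; that matches the single entry 0 currently written to /ParentTree, but whether it is fully correct for longer documents is an assumption I have not checked.

```latex
\documentclass{scrreprt}
\usepackage{luatex85}
\usepackage[tagged]{accessibility}

% Workaround: add /StructParents (and /Tabs) to each page dictionary.
% The key 0 is hard-coded here, not computed per page.
\pdfpageattr{/StructParents 0/Tabs/S}

\begin{document}
Text.
\end{document}
```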

viktoriasee avatar Feb 26 '20 10:02 viktoriasee

Ok, this could be a solution.

Could you extend the MWE with a page break and see if this fix still works, please?

AndyClifton avatar Feb 26 '20 12:02 AndyClifton

The MWE has a page break and it works: https://github.com/AndyClifton/accessibility/blob/master/tests/article/minimal-pdftex.tex

viktoriasee avatar Feb 26 '20 12:02 viktoriasee

@viktoriasee, when you have a chance, could you try one thing for me, please?

Try adding the option tagged and either flatstructure or highstructure to the call to accessibility, i.e.,

\usepackage[tagged, flatstructure]{accessibility}

and see if that changes anything?

AndyClifton avatar Jun 28 '20 08:06 AndyClifton

tagged was there already. Neither highstructure nor flatstructure makes a difference, but \pdfpageattr{/StructParents 0/Tabs/S} does.

viktoriasee avatar Jun 30 '20 11:06 viktoriasee