
set pdf viewer actions

Open femifrak opened this issue 5 years ago • 5 comments

In Acrobat it is possible to set actions that tell PDF viewers how to behave when opening a file, such as

  • showing the title or the file name or
  • showing the table of contents or
  • showing a certain page (instead of page 1) or
  • adjusting the window size
  • ...

Is it possible to add switches to ocrmypdf to select these actions, especially the first two? Is there an alternative method on Linux that does not require Acrobat?

Thanks!!

femifrak avatar Dec 21 '19 10:12 femifrak

You can look up how to configure these settings in the PDF reference manual and use pikepdf to set them. It's section 12.2 of the manual. Just add a /ViewerPreferences dictionary to the document catalog.

I could see adding an option to enable resizing to the first page (/FitWindow true) since that is more appropriate for some documents and common, and /NonFullScreenPageMode to set a preference for TOC or thumbnails. Unfortunately I don't know when I'll be able to do it, but it's not a difficult feature to add.

The more complex and situation specific things, or if you want those features now, could be done with an external script that uses pikepdf to edit the output files.

/DisplayDocTitle is weird. It says the title should come from the (deprecated) /DocumentInfo dictionary. I'm curious what happened to this in PDF 2.0 (I don't have that spec). Some of the other viewer preferences are user hostile IMO e.g. hiding toolbars; no interest in implementing those.

jbarlow83 avatar Dec 23 '19 09:12 jbarlow83

I tried this but did not succeed. Could you give a clue to a beginner please?

import pikepdf
myfile = pikepdf.open('in.pdf')
pikepdf.Dictionary(ViewerPreferences = {'/FitWindow':'True','/NonFullScreenPageMode':'UseOC'})
myfile.save('new.pdf')

femifrak avatar Sep 26 '20 12:09 femifrak

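# The values should be a real boolean and a PDF name rather than Python strings.
myfile.Root.ViewerPreferences = pikepdf.Dictionary({'/FitWindow': True, '/NonFullScreenPageMode': pikepdf.Name.UseOC})

You needed to attach the dictionary to the document catalog of the myfile object.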

jbarlow83 avatar Sep 26 '20 22:09 jbarlow83

Thanks, this way I can create the dictionary:

import pikepdf
myfile = pikepdf.open('in.pdf')
myfile.root.ViewerPreferences = pikepdf.Dictionary({
  "/NonFullScreenPageMode": "/UseOutlines"   # or: "/UseNone"
})
myfile.save('new.pdf')


However, setting the viewer preferences is probably not that simple after all (at least for me ;) ).

The presence of "/NonFullScreenPageMode": "/UseOutlines" in myfile.root.ViewerPreferences does not reliably cause the TOC to be displayed. Conversely, the absence of a ViewerPreferences dictionary does not guarantee that no TOC is displayed when the file is opened.

I ran several tests with Adobe Acrobat 9.5 (where this switch can be set explicitly) but could not find a systematic pattern. Are there other places where the TOC is stored, or do I have to activate something else somewhere?

femifrak avatar Sep 29 '20 04:09 femifrak

The table of contents topic keeps bothering me, because a TOC makes it extremely easy to jump to specific places in PDFs. So I investigated further and found the following:

1.) PDF readers remember their last state for each file, so the settings in the PDF file are not necessarily effective the second time the file is opened.

2.) Two PDFs uncompressed with pdftk, which differ only in the TOC display setting (set with Acrobat), differ only in "/PageMode /UseNone" versus "/PageMode /UseOutlines", and not in "/NonFullScreenPageMode ...". Is the correct name possibly "/PageMode" instead of "/NonFullScreenPageMode"?

3.) Regarding DisplayDocTitle, the following script works. Case matters here: pikepdf accepts only the capitalized Python value True and writes it to the PDF as lowercase true.

My code looks like this:

import sys
import subprocess
import pikepdf

doc = pikepdf.open(sys.argv[1])
doc.root.ViewerPreferences = pikepdf.Dictionary({
  "/DisplayDocTitle": True,
  "/PageMode": "/UseOutlines"
#  "/NonFullScreenPageMode": "/UseOutlines"
})

doc.save('temp1242.pdf')
doc.close()
subprocess.run(["cp", "temp1242.pdf", sys.argv[1]])
subprocess.run(["rm", "temp1242.pdf"])

Does it look okay so far, or am I doing something obviously wrong here, maybe with XMP and DocumentInfo? (If not, I won't push any further for now, but in an uncompressed PDF "UseNone" can be replaced with "UseOutlines" relatively easily in an editor, so I still have hope for a simple solution.)

Thanks a lot again!

femifrak avatar Oct 01 '20 08:10 femifrak