pagelabels-py

Using the script changes the PDF version from 1.5 to 1.3

Open JackDeSomeTrades opened this issue 6 years ago • 9 comments

I tried to use the script to change the page labels. This works perfectly, but it also changes the PDF version, which makes the output incompatible with a lot of archives.

Edit: On digging further, I believe this is a problem with the pdfwriter module of the pdfrw Python library.

https://github.com/pmaupin/pdfrw/blob/master/pdfrw/pdfwriter.py

It can only handle writing PDF 1.3.

Perhaps a warning can be placed in the readme.
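To confirm the version change described above, one option is to compare the `%PDF-x.y` header of the input and output files. The following is a minimal sketch using only the standard library; the function names are made up for illustration. It only reads the header, so it assumes the document catalog does not carry a `/Version` entry that overrides it (which is allowed from PDF 1.4 onward).

```python
import re
from typing import Optional


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version from a %PDF-x.y header, or None if none is found."""
    # The header is expected near the start of the file, so only scan the first 1 KiB.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def read_pdf_version(path: str) -> Optional[str]:
    """Read the header version from a file on disk (hypothetical helper)."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))
```

Calling `read_pdf_version("input.pdf")` and `read_pdf_version("output.pdf")` before and after running the script should show whether the header changed.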

JackDeSomeTrades avatar Nov 06 '19 14:11 JackDeSomeTrades

PDF 1.3 is almost 20 years old. What software are you using that does not support this version of PDF? Is it common?

EDIT: Sorry, I did not see the problem was that the PDF version went from 1.5 to 1.3.

lovasoa avatar Nov 06 '19 15:11 lovasoa

Have you considered using pikepdf instead of pdfrw? I just used it (from interactive Python prompt) to edit the Pagelabels and to set PageLayout = /TwoPageRight and it seemed fairly well made and writes PDF version 1.7.

hackerb9 avatar Sep 04 '21 02:09 hackerb9

GNU Ghostscript can convert PDF 1.3 to PDF 1.7.

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
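If the conversion needs to run from a Python workflow, the Ghostscript call above could be wrapped roughly as follows. This is a sketch, not a tested pipeline: the function names are hypothetical, it assumes `gs` is on the PATH, and it deliberately leaves out `-dPDFSETTINGS=/screen`, because that preset also downsamples images to a low resolution, which may not be wanted when the only goal is a newer PDF version.

```python
import subprocess


def build_gs_command(src: str, dst: str, level: str = "1.7") -> list:
    """Build the Ghostscript argument list for rewriting src as dst."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dCompatibilityLevel={level}",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={dst}",
        src,
    ]


def convert(src: str, dst: str, level: str = "1.7") -> None:
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(build_gs_command(src, dst, level), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.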

ambiamber avatar Aug 06 '22 11:08 ambiamber

> Have you considered using pikepdf instead of pdfrw? I just used it (from interactive Python prompt) to edit the Pagelabels and to set PageLayout = /TwoPageRight and it seemed fairly well made and writes PDF version 1.7.

Can you share your code?

Because docs say:

> There is currently no API to help with modifying the pdf.Root.PageLabels data structure, which contains the label definitions.

raffaem avatar Sep 15 '23 19:09 raffaem

> Because docs say:
>
> > There is currently no API to help with modifying the pdf.Root.PageLabels data structure, which contains the label definitions.

It's been a couple of years since I did that, and I seem to recall just following along with the docs. ~~Maybe that was experimental functionality that got removed?~~ [Edit: Nope, it is still there (see below). The docs are misleading. What they should say is that editing the PageLabels is so easy that there is no need for a special API.]

hackerb9 avatar Sep 20 '23 16:09 hackerb9

The recipe is documented at:

https://pikepdf.readthedocs.io/en/latest/api/models.html#pikepdf.NumberTree

Here is a complete working example:


from pikepdf import open as Pdfopen, Name, Dictionary, NumberTree

pdf = Pdfopen("input.pdf")

# Create an empty PageLabels number tree if the document has none yet
try:
    pdf.Root.PageLabels
except AttributeError:
    nt = NumberTree.new(pdf)
    pdf.Root.PageLabels = nt.obj

pagelabels = NumberTree(pdf.Root.PageLabels)

# Keys are 0-based page indices: each entry starts a new labeling range.

# Label pages starting at index 0 with lowercase Roman numerals
pagelabels[0] = Dictionary(S=Name.r)

# Label pages starting at index 6 (the seventh page) with decimal numbers
pagelabels[6] = Dictionary(S=Name.D)

pdf.save('output.pdf')

# Page labels will now be:
# i, ii, iii, iv, v, vi, 1, 2, 3, ...

hackerb9 avatar Sep 20 '23 17:09 hackerb9

And here are the settings you can use for the Dictionary in pagelabels. They are copied from the answer to how to use qpdf to directly edit a PDF's page numbers in a text editor, so they are written in raw PDF syntax and you'll have to translate them a little:

OPTIONAL: STARTING FROM A DIFFERENT NUMBER WITH /St

Each section restarts numbering at 1 unless you tell it otherwise using /St.

OPTIONAL: USING A DIFFERENT STYLE WITH /S

The /S key takes a value that lets you pick the numbering style:

  • /D digits (1, 2, 3...)
  • /R uppercase Roman (I, II, III...)
  • /r lowercase Roman (i, ii, iii...)
  • /A uppercase alphabetical (A, B, C, ..., X, Y, Z, then AA, BB, CC, ..., ZZ, and so on)
  • /a lowercase alphabetical (a, b, c, ..., x, y, z, then aa, bb, cc, ..., zz, and so on)

If one omits the /S key, then that section of pages will have no numbering. For example:

0 << >> 		% No label for cover

OPTIONAL: ADDING A PREFIX TO EACH PAGE WITH /P

You can show any string of text before the page number by specifying a word in parentheses after /P:

  31
  <<
    /S /D
    /P (A-)		% label appendix pages A-1, A-2, A-3
  >>

Specifying a prefix without a style (/S) will give you pages that have only the prefix text without any number. This can be useful, for example, if you'd like a cover page to simply have the label "Cover".

     0 << /P (Cover) >> 	   % No number, just "Cover"
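To see how /S, /P and /St combine, here is a small pure-Python sketch that computes the labels a viewer would be expected to show for a given set of ranges. It does not use pikepdf or qpdf; the function names and the plain-dict representation (keys "S", "P", "St" without the leading slash) are invented for illustration. The letter styles follow the PDF specification's wording (A to Z, then AA to ZZ), which is an assumption about how a given viewer behaves.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer to an uppercase Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
             (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
             (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def to_letters(n: int) -> str:
    """A..Z for 1..26, then doubled letters AA..ZZ for 27..52, and so on."""
    repeat, index = divmod(n - 1, 26)
    return chr(ord("A") + index) * (repeat + 1)


def format_number(style, n: int) -> str:
    """Render n in the given /S style; no style means no numeric part."""
    if style == "D":
        return str(n)
    if style == "R":
        return to_roman(n)
    if style == "r":
        return to_roman(n).lower()
    if style == "A":
        return to_letters(n)
    if style == "a":
        return to_letters(n).lower()
    return ""


def page_labels(ranges: dict, page_count: int) -> list:
    """ranges maps a 0-based start index to a dict with optional S, P, St."""
    starts = sorted(ranges)
    labels = []
    for page in range(page_count):
        applicable = [s for s in starts if s <= page]
        if not applicable:
            labels.append("")  # page precedes the first range
            continue
        start = applicable[-1]
        spec = ranges[start]
        # Numbering restarts at /St (default 1) at the beginning of each range.
        number = spec.get("St", 1) + (page - start)
        labels.append(spec.get("P", "") + format_number(spec.get("S"), number))
    return labels
```

For example, `page_labels({0: {"P": "Cover"}, 1: {"S": "D", "P": "A-"}}, 3)` gives "Cover" for the first page and "A-1", "A-2" for the next two, matching the prefix behaviour described above.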

hackerb9 avatar Sep 20 '23 17:09 hackerb9

You know what, I'll just make a proper answer on StackExchange: https://superuser.com/a/1809284/400780

hackerb9 avatar Sep 20 '23 20:09 hackerb9

@hackerb9 Thank you

raffaem avatar Sep 20 '23 23:09 raffaem