KeyError '/Pages' is raised when pages are merged
When I merge two PDF files with PDF version 1.3, the following KeyError is raised.
Traceback (most recent call last):
  File "pdf_merge.py", line 56, in <module>
    merger.merge()
  File "pdf_merge.py", line 29, in merge
    self._merger.append(open(temp, "rb"))
  File "...\Python35\lib\site-packages\PyPDF2\merger.py", line 203, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "...\Python35\lib\site-packages\PyPDF2\merger.py", line 139, in merge
    pages = (0, pdfr.getNumPages())
  File "...\Python35\lib\site-packages\PyPDF2\pdf.py", line 1155, in getNumPages
    self._flatten()
  File "...\Python35\lib\site-packages\PyPDF2\pdf.py", line 1506, in _flatten
    pages = catalog["/Pages"].getObject()
  File "...\Python35\lib\site-packages\PyPDF2\generic.py", line 516, in __getitem__
    return dict.__getitem__(self, key).getObject()
KeyError: '/Pages'
Same error. @euter, any resolution?
Okay, my bad: I just called merger.append() after merger.write().
Same error here! I couldn't find anything about this. Any solution?
As it's been a long time and I have neither an example PDF nor example code, I'll close this. If anybody still runs into this issue with the latest PyPDF2 version, please let me know!
I'm having the same issue. I have a copy of the PDF that's causing the issue.
UTA_OSHA_3115_Fall_Protection_Training_09162021_.pdf
For some context: I'm using requests to retrieve a PDF from a URL, then I copy the response content into a buffer, and from there I attempt to merge the PDF using merger.append. Previously I was saving the file to a temporary location; I refactored that later. This works for 99% of the PDFs I'm merging, but there are 4 PDFs I'm working with that consistently trigger this error.
Thanks for adding an example @red-shift !
MCVE: Code + PDF
UTA_OSHA_3115_Fall_Protection_Training_09162021_.pdf
from PyPDF2 import PdfReader
reader = PdfReader("UTA_OSHA_3115_Fall_Protection_Training_09162021_.pdf")
print(len(reader.pages))
Problem analyzed: the first xref table is read correctly, but the one referenced by "/Prev" starts at 1. When the first objects are checked, an offset in the index is detected and then applied to the whole table (by _zero_xref). This damages the entries of the first xref table, which causes the problem. The correction should be applied object by object. PR in progress.
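To illustrate the analysis above, here is a rough, standalone sketch of the difference between shifting the whole table and correcting it object by object. This is not PyPDF2's internal code: the names build_pdf_like, id_at, shift_whole_table and fix_per_object are all hypothetical, and the xref table is assumed to be a plain dict mapping object number to byte offset.

```python
import re

# Hypothetical sketch, not PyPDF2 internals: an xref table is modelled as
# {object_number: byte_offset}.
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj")


def build_pdf_like(ids):
    """Build a tiny PDF-like byte string and return it with the true offsets."""
    data = b"%PDF-1.3\n"
    offsets = {}
    for num in ids:
        offsets[num] = len(data)
        data += b"%d 0 obj\n<< >>\nendobj\n" % num
    return data, offsets


def id_at(data, offset):
    """Return the object number declared at `offset`, or None if no header is found."""
    match = OBJ_HEADER.match(data, offset)
    return int(match.group(1)) if match else None


def shift_whole_table(xref, shift):
    """Problematic approach: apply one detected shift to every entry."""
    return {num - shift: off for num, off in xref.items()}


def fix_per_object(data, xref):
    """Per-object approach: re-key each entry by the header found at its offset."""
    fixed = {}
    for num, off in xref.items():
        actual = id_at(data, off)
        # Keep the original key when no object header can be read at the offset.
        fixed[num if actual is None else actual] = off
    return fixed
```

For example, with objects 1, 2 and 5 in the data and a table whose first two entries are off by one ({2: offset of object 1, 3: offset of object 2, 5: offset of object 5}), shift_whole_table(xref, 1) fixes the first two keys but re-keys the correct entry for object 5 as 4, while fix_per_object(data, xref) leaves that entry alone.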