pypdf Pages without Resources dictionary

mergePage() fails on pages that do not have a /Resources dictionary. This appears to be a valid condition, in which case the page should inherit the dictionary content from its parent. Traceback below:

Traceback (most recent call last):
  File "C:\Python27\lib\lib-tk\Tkinter.py", line 1536, in __call__
    return self.func(*args)
  File "pdfbind\view.py", line 153, in _on_execute_click
    b.bind()
  File "pdfbind\bind.py", line 176, in bind
    page_bufs.append(page_header.merge(p, header_info))
  File "pdfbind\header.py", line 68, in merge
    header_page.mergePage(orig_page)
  File "PyPDF2\pdf.py", line 2211, in mergePage
    self._mergePage(page2)
  File "PyPDF2\pdf.py", line 2221, in _mergePage
    page2Resources = page2["/Resources"].getObject()
  File "PyPDF2\generic.py", line 512, in __getitem__
    return dict.__getitem__(self, key).getObject()
KeyError: '/Resources'

Jun 23 '16 01:06 jvalenzuela

Could you possibly share the PDF(s) you're working with so I can take a closer look? PyPDF2 does (or is supposed to) support inheritance of missing page attributes from a parent.

Jun 23 '16 17:06 mstamy2

Here's one of the files causing the problem. Starting with some other page from another document, then calling mergePage() with this PDF results in the above error. 108.pdf

Jun 24 '16 03:06 jvalenzuela

While PyPDF2 does allow inheriting certain page attributes, it appears that none of the page's parents contain the Resources dictionary either. It is a required entry; however, I'll try to implement a workaround in strict=False mode.

Jun 24 '16 16:06 mstamy2
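
For readers following along, here is a minimal sketch of the behaviour discussed above: look up /Resources on the page, walk the /Parent chain if it is missing, and fall back to an empty dictionary when not strict. This is not PyPDF2's code. Plain dicts stand in for PDF objects, and the names inherited and resources_for_merge are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for a page-tree node; real PyPDF2 objects differ.
PageNode = Dict[str, Any]


def inherited(node: PageNode, key: str) -> Optional[Any]:
    """Return the value for key from node or its nearest ancestor, else None."""
    seen = set()  # guard against malformed files with circular /Parent links
    current: Optional[PageNode] = node
    while current is not None and id(current) not in seen:
        if key in current:
            return current[key]
        seen.add(id(current))
        current = current.get("/Parent")
    return None


def resources_for_merge(page: PageNode, strict: bool = False) -> Dict[str, Any]:
    """Resolve /Resources with inheritance; tolerate its absence unless strict."""
    resources = inherited(page, "/Resources")
    if resources is None:
        if strict:
            # Mirrors the failure in the traceback above.
            raise KeyError("/Resources")
        return {}  # assumed workaround: treat the page as having no resources
    return resources


# A page whose parent supplies /Resources, and one with none anywhere.
root = {"/Type": "/Pages"}
kids = {"/Type": "/Pages", "/Resources": {"/Font": {}}, "/Parent": root}
page_ok = {"/Type": "/Page", "/Parent": kids}
page_bad = {"/Type": "/Page", "/Parent": root}
```

With these sample nodes, resources_for_merge(page_ok) resolves the dictionary from the parent, while resources_for_merge(page_bad) returns an empty dict instead of raising. Note that in PyPDF2 itself, dictionary keys and values must be PDF objects (for example NameObject and DictionaryObject from PyPDF2.generic), so this sketch would need adapting before use on real pages.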

Was this issue resolved?

May 25 '22 04:05 sjacob90

"Here's one of the files causing the problem. Starting with some other page from another document, then calling mergePage() with this PDF results in the above error. 108.pdf"

Tested successfully:

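import PyPDF2

# Read the problem file and append all of its pages to a merger
p = PyPDF2.PdfReader("c:/108.pdf")
m = PyPDF2.PdfMerger()
m.append(p)
# Write the result to a new file
with open("c:/tt.pdf", "wb") as f:
    m.write(f)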

The issue can be closed.

Sep 03 '22 14:09 pubpub-zz

Thank you for checking @pubpub-zz :heart:

Sep 06 '22 19:09 MartinThoma

Maybe I'm missing something, but it looks like https://github.com/py-pdf/PyPDF2/pull/1276 only fixes the _extract_text function. I'm still having issues with the _merge_page function and this call: original_resources = cast(DictionaryObject, self[PG.RESOURCES].get_object()) when I have a page that is missing the /Resources dict.

  File "/site-packages/PyPDF2/_page.py", line 508, in merge_page
    self._merge_page(page2, expand=expand)
  File "/site-packages/PyPDF2/_page.py", line 532, in _merge_page
    original_resources = cast(DictionaryObject, self[PG.RESOURCES].get_object())
  File "/site-packages/PyPDF2/generic/_data_structures.py", line 149, in __getitem__
    return dict.__getitem__(self, key).get_object()
KeyError: '/Resources'

Sep 14 '22 14:09 FredrikWallstrom

@FredrikWallstrom Which version of PyPDF2 are you using?

Sep 14 '22 14:09 MartinThoma

Which version of PyPDF2 are you using?

2.10.8

Sep 14 '22 15:09 FredrikWallstrom

@FredrikWallstrom To be sure we focus on the real problem, can you provide a test file and the code?

Thanks

Sep 14 '22 21:09 pubpub-zz

PDF: 108.pdf

Stupid code example but the principle is the same:

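    from PyPDF2 import PdfReader

    reader = PdfReader("108.pdf")  # path to, or a binary stream of, the attached 108.pdf
    page_one = reader.pages[0]
    page_two = reader.pages[0]
    page_one.merge_page(page_two)  # raises KeyError: '/Resources' on 2.10.8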

Sep 15 '22 05:09 FredrikWallstrom

A good example improves analysis. Thanks.

Should be good now

Sep 15 '22 21:09 pubpub-zz
