PdfObject.indirect_reference is not available

Open stefan6419846 opened this issue 7 months ago • 5 comments

Creating a simple PdfObject or DictionaryObject and accessing its indirect_reference attribute fails, although it is part of the protocol/PdfObject class.

Initially discovered while looking at #3293.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.4.0-150600.23.47-default-x86_64-with-glibc2.38

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.5.0, crypt_provider=('cryptography', '44.0.0'), PIL=11.1.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf.generic import DictionaryObject, PdfObject

print(PdfObject().indirect_reference)
print(DictionaryObject().indirect_reference)

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "/home/stefan/tmp/scratches/scratch_6.py", line 3, in <module>
    print(PdfObject().indirect_reference)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'PdfObject' object has no attribute 'indirect_reference'

stefan6419846 avatar May 22 '25 11:05 stefan6419846

Hello. Is this still an issue? I copied the code to see if the error still existed, and I wanted to know whether this is still considered unexpected behavior.

HSY-999 avatar Dec 11 '25 18:12 HSY-999

Yes, this still is an issue which needs investigation. Ideally, we are able to identify the limits of the current protocol-based approach (or the misunderstanding) here before looking into possible solutions.

stefan6419846 avatar Dec 11 '25 19:12 stefan6419846

Having the attribute or not was a way to detect whether the object was manually created. If the object is part of a PdfDocument, indirect_reference is either None or an IndirectObject, depending on the relevance.

pubpub-zz avatar Dec 11 '25 21:12 pubpub-zz

Having the attribute or not was a way to detect whether the object was manually created. If the object is part of a PdfDocument, indirect_reference is either None or an IndirectObject, depending on the relevance.

Would that mean:

attribute not found -> manually created via instantiation, not written to any PDF
indirect_reference is None -> part of a PDF
isinstance(indirect_reference, IndirectObject) -> clone of PdfObject

I hope that makes sense.

HSY-999 avatar Dec 11 '25 22:12 HSY-999

Having the attribute or not was a way to detect whether the object was manually created.

This sounds like a bad way to implement such a check: it violates the protocol, which I see as a contract for these classes, and it is not documented in any obvious place. Thus, I would assume that every class which is based upon the protocol defining the indirect_reference would actually provide an indirect_reference, either None or a proper reference. If we really want to check whether an object belongs to a PDF file, we could still check whether indirect_reference is not None and whether its pdf attribute is the correct one.

Given that explanation, solving this issue might become more complex as it requires reviewing all current lines which rely on the "wrong" behavior.
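
The check proposed above could look roughly like the following. This is only a sketch, not pypdf code: FakeReference, FakeObject and belongs_to are hypothetical stand-ins invented for illustration, and the only assumption taken from the discussion is that a reference exposes a pdf attribute pointing at its owning document.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeReference:
    """Hypothetical stand-in for a reference object with a `pdf` attribute."""

    idnum: int
    generation: int
    pdf: Any


class FakeObject:
    """Hypothetical stand-in for an object that always honours the protocol."""

    # Class-level default: the attribute exists even on manually created objects.
    indirect_reference: Optional[FakeReference] = None


def belongs_to(obj: Any, pdf: Any) -> bool:
    """Return True if `obj` carries a reference that points into `pdf`."""
    ref = getattr(obj, "indirect_reference", None)
    return ref is not None and ref.pdf is pdf


# Two placeholder "documents" and two objects, one of them registered.
document = object()
other_document = object()

standalone = FakeObject()
registered = FakeObject()
registered.indirect_reference = FakeReference(idnum=1, generation=0, pdf=document)
```

With a default of None, reading the attribute never raises, and membership is decided by the reference itself rather than by whether the attribute exists.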

stefan6419846 avatar Dec 12 '25 07:12 stefan6419846

Having the attribute or not was a way to detect whether the object was manually created. If the object is part of a PdfDocument, indirect_reference is either None or an IndirectObject, depending on the relevance.

would that mean:

correction

attribute not found -> manually created via instantiation, not written to any PDF
indirect_reference is None -> within an array or other containing object in the PDF
isinstance(indirect_reference, IndirectObject) -> PdfObject which can be addressed by its index
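
For illustration, the three states listed above can be told apart as in this sketch. It does not use pypdf: FakeIndirectObject and describe_state are hypothetical names, and SimpleNamespace objects stand in for PDF objects, so it only shows the detection logic and not how the library sets the attribute.

```python
from types import SimpleNamespace
from typing import Any

# Sentinel to distinguish "attribute missing" from "attribute is None".
_MISSING = object()


class FakeIndirectObject:
    """Hypothetical stand-in for an indirect reference (object number only)."""

    def __init__(self, idnum: int, generation: int = 0) -> None:
        self.idnum = idnum
        self.generation = generation


def describe_state(obj: Any) -> str:
    """Map the three cases from the list above to a short label."""
    ref = getattr(obj, "indirect_reference", _MISSING)
    if ref is _MISSING:
        return "manually created"
    if ref is None:
        return "nested in a containing object"
    if isinstance(ref, FakeIndirectObject):
        return "addressable by index"
    raise TypeError(f"unexpected indirect_reference: {ref!r}")


# One placeholder object per state.
manual = SimpleNamespace()
nested = SimpleNamespace(indirect_reference=None)
indexed = SimpleNamespace(indirect_reference=FakeIndirectObject(7))
```

The sentinel is needed because a plain getattr(obj, "indirect_reference", None) would merge the first two cases, which is exactly the distinction the current behavior relies on.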
 

pubpub-zz avatar Dec 13 '25 10:12 pubpub-zz