PdfObject.indirect_reference is not available
Creating a simple PdfObject or DictionaryObject and accessing its indirect_reference attribute fails, although it is part of the protocol/PdfObject class.
Initially discovered while looking at #3293.
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-6.4.0-150600.23.47-default-x86_64-with-glibc2.38
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.5.0, crypt_provider=('cryptography', '44.0.0'), PIL=11.1.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf.generic import DictionaryObject, PdfObject
print(PdfObject().indirect_reference)
print(DictionaryObject().indirect_reference)
Traceback
This is the complete traceback I see:
Traceback (most recent call last):
  File "/home/stefan/tmp/scratches/scratch_6.py", line 3, in <module>
    print(PdfObject().indirect_reference)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'PdfObject' object has no attribute 'indirect_reference'
Hello. Is this still an issue? I copied the code to see whether the error still occurs, and I wanted to know if this is still considered unexpected behavior.
Yes, this still is an issue which needs investigation. Ideally, we are able to identify the limits of the current protocol-based approach (or the misunderstanding) here before looking into possible solutions.
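As a starting point for that investigation, here is a minimal sketch of one plausible cause, assuming the attribute is only declared as a type annotation on the protocol/base class and never assigned a value. The class names below (HasReference, AnnotatedOnly, WithDefault) are hypothetical stand-ins and are not pypdf classes.

```python
from typing import Optional, Protocol


class HasReference(Protocol):
    # Annotation only: tells type checkers about the attribute,
    # but does not create it at runtime.
    indirect_reference: Optional[object]


class AnnotatedOnly(HasReference):
    """Inherits the annotation and never assigns a value."""


class WithDefault(HasReference):
    # A class-level default makes the attribute exist on every instance.
    indirect_reference: Optional[object] = None


def has_reference_attribute(obj: object) -> bool:
    """Report whether the attribute exists at runtime."""
    return hasattr(obj, "indirect_reference")


print(has_reference_attribute(AnnotatedOnly()))
print(has_reference_attribute(WithDefault()))
```

If that assumption holds for pypdf, the protocol describes the attribute for static analysis only, and each concrete class still has to assign it somewhere.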
The presence or absence of the attribute was a way to detect whether the object was created manually. If the object is part of a PdfDocument, indirect_reference is either None or an IndirectObject, depending on relevance.
Would that mean:
attribute not found -> created manually via instantiation, not written to any PDF
indirect_reference is None -> part of a PDF
isinstance(indirect_reference, IndirectObject) -> clone of a PdfObject
I hope that makes sense.
The presence or absence of the attribute was a way to detect whether the object was created manually.
This sounds like a bad way to implement this: it violates the protocol, which I see as a contract for these classes, and it is not documented in any obvious way. I would therefore assume that every class based on the protocol that defines indirect_reference actually provides an indirect_reference, either None or a proper reference. If we really want to check whether an object belongs to a PDF file, we could still check whether indirect_reference is not None and whether its pdf attribute is the correct one.
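To make the proposed check concrete, here is a rough sketch under the assumption that the attribute always exists and that a reference carries a pdf attribute pointing at its owning document. StandInReference, StandInObject and belongs_to are hypothetical names used only for illustration and are not part of pypdf.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInReference:
    """Hypothetical reference: object number, generation and owning document."""
    idnum: int
    generation: int
    pdf: object


class StandInObject:
    # Under the proposed contract the attribute always exists, defaulting to None.
    indirect_reference: Optional[StandInReference] = None


def belongs_to(obj: StandInObject, pdf: object) -> bool:
    """Return True only if obj has a reference and that reference is owned by pdf."""
    ref = obj.indirect_reference
    return ref is not None and ref.pdf is pdf


writer_a, writer_b = object(), object()  # stand-ins for two separate documents

detached = StandInObject()
attached = StandInObject()
attached.indirect_reference = StandInReference(idnum=1, generation=0, pdf=writer_a)

print(belongs_to(detached, writer_a))
print(belongs_to(attached, writer_a))
print(belongs_to(attached, writer_b))
```

With a check like this, no caller would need hasattr or a try/except around the attribute access.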
Given that explanation, solving this issue might become more complex as it requires reviewing all current lines which rely on the "wrong" behavior.
The presence or absence of the attribute was a way to detect whether the object was created manually. If the object is part of a PdfDocument, indirect_reference is either None or an IndirectObject, depending on relevance.
Would that mean:
Correction:
attribute not found -> created manually via instantiation, not written to any PDF
indirect_reference is None -> within an array or other containing object in the PDF
isinstance(indirect_reference, IndirectObject) -> PdfObject which can be addressed by its index
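Written as code, the three cases above might look like the following sketch. It only mirrors my reading of the current behavior. The classes are hypothetical stand-ins rather than pypdf objects, and the tuple stands in for a real IndirectObject.

```python
_MISSING = object()  # sentinel to tell "attribute absent" apart from "attribute is None"


def describe_origin(obj: object) -> str:
    """Classify an object by the state of its indirect_reference attribute."""
    ref = getattr(obj, "indirect_reference", _MISSING)
    if ref is _MISSING:
        return "manually created"
    if ref is None:
        return "nested in a containing object"
    return "addressable indirect object"


class Bare:
    """Stand-in for a freshly instantiated object without the attribute."""


class Nested:
    """Stand-in for an object stored inside an array or dictionary."""
    indirect_reference = None


class Addressable:
    """Stand-in for an object with its own object number."""
    indirect_reference = (12, 0)  # placeholder for an IndirectObject


for candidate in (Bare(), Nested(), Addressable()):
    print(type(candidate).__name__, "->", describe_origin(candidate))
```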