pypdf
Unspecific type hints for reader.metadata
Take the below example file:
from PyPDF2 import PdfReader

with open("example.pdf", "rb") as fp:
    reader = PdfReader(fp)
    metadata = reader.metadata
    assert metadata is not None
    date_str = metadata["/CreationDate"]
    date_str = date_str.removeprefix("D:").replace("'", "")
    print(date_str)
It runs fine:
$ python example.py
20220415093243+0200
but Mypy complains about calling removeprefix() on date_str:
$ mypy example.py
example.py:8: error: "PdfObject" has no attribute "removeprefix" [attr-defined]
Found 1 error in 1 file (checked 1 source file)
This is due to DocumentInformation being a subclass of DictionaryObject, and thus only guaranteeing that the values returned are PdfObjects. In practice they seem to only be TextStringObjects, which subclass str. If they're always TextStringObjects, the types in DocumentInformation should be adjusted accordingly.
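Until the hints are tightened, one possible workaround is to narrow the value with isinstance() before calling str methods on it. The sketch below is only an illustration: creation_date_digits is a hypothetical helper (not part of PyPDF2), and a plain dict stands in for reader.metadata so the snippet runs without a PDF. It assumes Python 3.9+ for str.removeprefix().

```python
from typing import Mapping


def creation_date_digits(metadata: Mapping[str, object]) -> str:
    """Hypothetical helper: return /CreationDate without "D:" and apostrophes."""
    value = metadata["/CreationDate"]
    # isinstance() narrows the static type from object to str, so Mypy
    # accepts the str methods below. TextStringObject subclasses str,
    # so real metadata values should pass this check.
    if not isinstance(value, str):
        raise TypeError(f"expected a str subclass, got {type(value).__name__}")
    return value.removeprefix("D:").replace("'", "")


# Assumption: a plain dict stands in for reader.metadata here.
sample = {"/CreationDate": "D:20220415093243+02'00'"}
print(creation_date_digits(sample))
```

This keeps the runtime behaviour of the original script and fails loudly if a value ever turns out not to be a string, which is the case the current type hints allow for.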
Environment
$ python -m platform
macOS-12.5-arm64-arm-64bit
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.9.0
Code + PDF
The code is shown above; the PDF used is metadata.pdf from the PyPDF2 resources.