PyPDF4
A utility to read and write PDFs with Python
It would be better to fetch document info (title, author, etc.) by reading the first page, since for some PDFs which don't have those fields...
Unable to read PDF files with unusual filenames. For instance, I have PDFs with filenames like `[author1;author2;authorN]_pdf_title_(comment).pdf`. Every time I try to read a file it throws an error of "unexpected...
As can be clearly seen, `generic.py` stores PDF object definitions and other ancillaries. I know I am being a terrible nag here for all the subliminal changes to the codebase,...
Probably the current versioning style is already close to it, but I propose to stick more seriously to the succinct specification of [Semantic Versioning (semver) 2.0.0](https://semver.org/). This will aid communicating...
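As a rough illustration of what the proposal implies for tooling, the sketch below parses and compares `MAJOR.MINOR.PATCH` strings. It is a simplification that ignores the pre-release and build-metadata parts of the specification; `Version`, `parse_version`, and `is_major_bump` are hypothetical names, and the version strings are made up for the example.

```python
import re
from typing import NamedTuple

# Simplified core pattern: MAJOR.MINOR.PATCH with no leading zeros.
# Pre-release tags and build metadata are deliberately not handled here.
_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' string into a comparable tuple."""
    match = _CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def is_major_bump(old: str, new: str) -> bool:
    """True when the major number increased between the two versions."""
    return parse_version(new).major > parse_version(old).major
```

Because `Version` is a tuple, versions compare numerically field by field, so `1.10.0` sorts after `1.9.3` even though a plain string comparison would say otherwise.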
When `PdfFileMerger` merges PDF/A files, it loses the PDF/A information and resets the PDF version to 1.3. Example PDF/A information: ``` 1 A LibreOffice 6.1 Draw 2019-04-03T06:18:04Z ``` [PDF/A](https://en.wikipedia.org/wiki/PDF/A) is a...
Some of the unit tests I have developed rely on PDF files that have certain features. In Calibre, I own a collection of 109+ PDF books, but amongst them I...
Currently the `addAttachment` function can embed only a single file. If it is called again for another file, the existing attachment is overwritten. It would be useful to...
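For context, the PDF format keeps embedded files in a name tree whose `/Names` array alternates keys and values (`[key1, value1, key2, value2, ...]`) with keys in sorted order. The sketch below models that array with plain Python lists, not PyPDF4 objects, to show how a new entry could be merged in rather than replacing the array. `add_embedded_file` is a hypothetical helper, and the string values stand in for file specification dictionaries.

```python
from typing import Any, List


def add_embedded_file(names: List[Any], filename: str, filespec: Any) -> List[Any]:
    """Return a new flat [key, value, ...] array that also contains filename.

    Existing entries are preserved; an entry with the same filename is
    replaced. Keys are re-sorted because name-tree keys must be ordered.
    (Python sorts str by code point, which is adequate for ASCII names.)
    """
    # Rebuild a mapping from the alternating key/value layout.
    entries = dict(zip(names[0::2], names[1::2]))
    entries[filename] = filespec

    flat: List[Any] = []
    for key in sorted(entries):
        flat.extend([key, entries[key]])
    return flat


# Usage: two attachments end up side by side in one array.
names = add_embedded_file([], "notes.txt", "filespec-for-notes")
names = add_embedded_file(names, "data.csv", "filespec-for-data")
```

Returning a new list instead of mutating the input keeps the helper easy to test; a real implementation would need to write the result back into the document catalog.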
While developing unit tests for `filters.py` I have noticed that `assert` is used far more often than a proper exception. AFAIK, even in Python the `assert` keyword has...
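One concrete reason this matters: `assert` statements are stripped when Python runs with the `-O` flag, so a check written as an assertion silently disappears in optimized mode. The sketch below contrasts the two styles; `validate_columns_assert`, `validate_columns`, and `FilterParameterError` are hypothetical names for illustration, not part of PyPDF4.

```python
class FilterParameterError(ValueError):
    """Hypothetical exception for invalid filter parameters."""


def validate_columns_assert(columns: int) -> int:
    # This check is removed entirely under `python -O`,
    # so invalid input would pass through unnoticed.
    assert columns > 0, "columns must be positive"
    return columns


def validate_columns(columns: int) -> int:
    # An explicit exception always runs and gives callers
    # a specific type to catch in tests and in production.
    if columns <= 0:
        raise FilterParameterError(f"columns must be positive, got {columns}")
    return columns
```

With an explicit exception, a unit test can assert on the exception type, for example via `unittest.TestCase.assertRaises(FilterParameterError)`, instead of catching a generic `AssertionError`.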
I'm in the process of developing unit tests for all the classes in [filters.py](https://github.com/claird/PyPDF4/blob/master/PyPDF4/filters.py) and with the following code I get several kinds of exceptions, based on the input: ```...