pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Addresses #1185
The inline image parser does not look for whitespace before the `EI` keyword as it should. Thus if you have a content stream as follows, the parser would crash: ```...
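The whitespace check described above can be sketched roughly as follows. This is an illustrative, standard-library-only sketch and not pypdf's actual parser: the function name `find_inline_image_end` and the `WHITESPACE` constant are hypothetical. It also ignores the harder case where the binary image data itself happens to contain a whitespace-delimited `EI`.

```python
# PDF whitespace characters: NUL, tab, line feed, form feed, carriage return, space.
WHITESPACE = b"\x00\t\n\x0c\r "


def find_inline_image_end(stream: bytes, start: int = 0) -> int:
    """Return the index of the first `EI` that is preceded by whitespace
    and followed by whitespace or end of data; return -1 if none is found.

    Hypothetical helper for illustration only.
    """
    pos = start
    while True:
        pos = stream.find(b"EI", pos)
        if pos == -1:
            return -1
        # `EI` only counts as a keyword if a whitespace byte comes before it.
        before_ok = pos > 0 and stream[pos - 1] in WHITESPACE
        # After `EI` we expect whitespace or the end of the stream.
        after_ok = pos + 2 >= len(stream) or stream[pos + 2] in WHITESPACE
        if before_ok and after_ok:
            return pos
        # Otherwise these bytes are part of the image data; keep scanning.
        pos += 1
```

With this check, an `EI` byte pair embedded directly in the image data is skipped, and only the whitespace-delimited keyword is treated as the end of the inline image.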
## Explanation More complete feature support for annotations. ## Code Example Just like the other annotations ## Hints * TABLE 8.34 Additional entries specific to a pop-up annotation * See...
## Explanation The pattern `writer.add_metadata(reader.metadata)` doesn't work as there can be indirect objects. This is unfortunate and an unnecessary complexity for the user. ## Code Example ```python from PyPDF2 import...
Hi! I would like the possibility of merging in pages as hidden attachments, is this possible with pypdf at all? Edit: This seems to have been added now (thanks!), but...
`extractText()` uses an excessive amount of CPU and memory on the following 1-page, 3 MB file. The extraction doesn't complete and the process has to be killed. http://www.dora.state.co.us/pls/efi/efi_p2_v2_demo.show_document?p_dms_document_id=105933&p_session_id=
Issue with text extraction (spacing) ## Environment Which environment were you using when you encountered the problem? windows 10 ``` $ python -c "import PyPDF2;print(PyPDF2.__version__)" 2.7.0 ``` ## Code +...
Chapter "10.7 Tagged PDF" (PDF 1.7 standard) is something that PyPDF2 doesn't support at all... at least I think so. I need to read that chapter (+ several related other...
The following script originally hung, but with PyPDF2==2.4.2 we get `PdfReadError: EOF marker not found`. ## MCVE: PDF + Code [This file](https://www.puc.nh.gov/Regulatory/CASEFILE/2001/01-006%20THROUGH%20MARCH%202010/01-006%202009-04-30%20FRP%20NON%20CONFIDENTIAL%20PAP%20FILING.PDF) is 298MB with 21 pages. ```python from PyPDF2...