pypdf
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
Some images seem to be extracted with the wrong colors. Apparently the custom palette is not considered in these cases. The final byte string from ``` ['/Indexed', '/DeviceRGB', 164, b'\xed\x1c$\xed\x1e&\xed...
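For context, an `/Indexed` color space has the form `[/Indexed base hival lookup]`: each sample in the image is an index into the lookup bytes, and with a `/DeviceRGB` base every index selects three consecutive bytes. The following is a minimal sketch of that palette expansion, assuming 8-bit indices; the function name `expand_indexed_rgb` is hypothetical and is not part of pypdf.

```python
def expand_indexed_rgb(indices: bytes, lookup: bytes, hival: int) -> bytes:
    """Map palette indices to RGB triples (hypothetical helper, not pypdf API).

    Assumes one byte per index and a /DeviceRGB base color space,
    so every palette entry occupies exactly 3 bytes in `lookup`.
    """
    needed = 3 * (hival + 1)
    if len(lookup) < needed:
        raise ValueError(f"lookup table too short: {len(lookup)} < {needed}")

    out = bytearray()
    for index in indices:
        if index > hival:
            raise ValueError(f"index {index} exceeds hival {hival}")
        # Entry i lives at bytes [3*i, 3*i + 3) of the lookup table.
        out += lookup[3 * index : 3 * index + 3]
    return bytes(out)


# Two-entry palette: index 0 is red, index 1 is green.
palette = bytes([255, 0, 0, 0, 255, 0])
pixels = expand_indexed_rgb(b"\x01\x00", palette, hival=1)
```

If the palette is skipped, the raw index bytes are interpreted as color samples directly, which would be consistent with the wrong colors described above.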
Hi all, I have been wanting to extract all images in a PDF file as separate image files. The process seems to be causing errors during the extraction of PNG...
Hello, When executing this piece of code: ```python from pypdf import PdfReader,PdfWriter import traceback try: input_pdf = PdfReader(dwnld_filepath) output_pdf = PdfWriter() image = input_pdf.pages[0] output_pdf.add_page(image) output_pdf.write(file_path) except Exception as e:...
I am merging a file that has embedded fonts named following the Adobe PDF 1.7 standard: 496 0 obj endobj ## Environment ```bash $ python -m platform # macOS-14.2.1-arm64-arm-64bit $...
Running the tests will currently generate (and delete) temporary files inside the current working directory. This does not look right and should probably be replaced by appropriate temporary directories/files instead....
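One way to keep such files out of the working directory is the standard library's `tempfile` module. The sketch below is only an illustration of the idea, not the project's actual test code; the helper name `write_in_temp_dir` and the file name `output.pdf` are made up for the example.

```python
import tempfile
from pathlib import Path


def write_in_temp_dir(data: bytes, name: str = "output.pdf") -> bool:
    """Write `data` into a throwaway directory and report on the cleanup.

    Returns True if the file existed inside the temporary directory and
    was gone again once the directory had been removed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / name
        target.write_bytes(data)
        existed = target.exists()
    # Leaving the `with` block deletes the directory and its contents.
    return existed and not target.exists()


cleaned_up = write_in_temp_dir(b"%PDF-1.7\n")
```

In a pytest suite, the built-in `tmp_path` fixture offers the same isolation without the explicit context manager.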
_read_inline_image got into an infinite loop. I cannot share the PDF itself, but I can say that it contained only images. The issue is here: ` while True: #...
I was trying to use the exact example mentioned [here](https://pypdf.readthedocs.io/en/latest/user/extract-text.html#example-1-ignore-header-and-footer), but it gives blank output, even though I copied the same code and used the same [PDF file](https://github.com/py-pdf/pypdf/blob/main/resources/GeoBase_NHNC1_Data_Model_UML_EN.pdf). (Fix is...
The code below results in what looks like a bunch of hexadecimal output. The first page of the PDF is displayed below; I note that I can copy/paste text normally from...
When encrypting PDF files, there is no verification of whether the reserved permission bits are passed correctly. This seems to allow for PDF files which do not completely follow the PDF...
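To illustrate what such a verification could look like, here is a minimal sketch. It assumes the reserved-bit layout of the standard security handler's permission table as I understand it (bits 1-2 must be 0; bits 7-8 and 13-32 must be 1, with bit 1 being the least significant); the function names are hypothetical and do not come from pypdf.

```python
# Bit positions are 1-based, bit 1 being the least significant bit.
# Assumed layout: bits 1-2 reserved as 0, bits 7-8 and 13-32 reserved as 1.
RESERVED_ZERO_BITS = (1, 2)
RESERVED_ONE_BITS = (7, 8) + tuple(range(13, 33))


def _mask(bits) -> int:
    """Build an integer mask with the given 1-based bit positions set."""
    mask = 0
    for bit in bits:
        mask |= 1 << (bit - 1)
    return mask


MUST_BE_ZERO = _mask(RESERVED_ZERO_BITS)
MUST_BE_ONE = _mask(RESERVED_ONE_BITS)


def reserved_bits_valid(flags: int) -> bool:
    """Check the reserved bits of a 32-bit permission value."""
    flags &= 0xFFFFFFFF  # treat negative (signed) inputs as unsigned
    return (flags & MUST_BE_ZERO) == 0 and (flags & MUST_BE_ONE) == MUST_BE_ONE


def normalize_reserved_bits(flags: int) -> int:
    """Force the reserved bits to their required values, keep the rest."""
    flags &= 0xFFFFFFFF
    return (flags | MUST_BE_ONE) & ~MUST_BE_ZERO & 0xFFFFFFFF


all_bits_set = 0xFFFFFFFF
fixed = normalize_reserved_bits(all_bits_set)
```

A check like `reserved_bits_valid` could either reject a caller-supplied value or, via `normalize_reserved_bits`, silently correct it before encryption; which behavior is preferable is a design decision the issue leaves open.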
At the moment, CI uses fixed package versions. Despite being under active development, this hides possible incompatibilities with newer package versions, as recently seen for `pytest>=8`. On the other hand,...