pypdf

A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.

Results 221 pypdf issues

I'm trying to extract text from a PDF together with the position of the text. When I do it in pypdf 3.16 I get the expected result, but I don't...

workflow-advanced-text-extraction

## Explanation Hello, I am exploring how to populate a PDF form using pypdf. The PDF form I am working on is the following one: https://www.uspto.gov/sites/default/files/patents/process/file/efs/guidance/updated_IDS.pdf It is used for...

workflow-forms

I am trying to parse [this PDF](https://www.joinville.sc.gov.br/wp-content/uploads/2023/11/Pesquisa-de-Precos-Combustiveis-novembro-2023.pdf). However, the output of extract_text() contains a bunch of spaces that are not in the original PDF. See the screenshot...

is-bug
workflow-text-extraction
Has MCVE
help wanted
whitespace

I get garbled characters when parsing a PDF file. The file I use is [this](http://www.aas.net.cn/fileZDHXB/journal/article/zdhxb/2012/8/PDF/20120812.pdf). Could there be encoding issues? ## Environment ```bash $ python -m platform Linux-4.18.0-147.5.1.6.h841.eulerosv2r9.x86_64-x86_64-with-glibc2.17 $ python -c...

workflow-text-extraction
Has MCVE

Proposal to complete #2203

Add capability to change font and size. Closes #2253

help wanted

Provides the same interface to access root, info, and id so that the code can be shared. The objective is to prepare some code factorization between PdfWriter and PdfReader

I am trying to extract images from PDF files; however, PIL occasionally raises a 'not enough image data' exception when handling certain PDFs. The files look correct in Atril...

is-bug
workflow-images
Has MCVE

I am trying to use PdfReader and PdfWriter to read/write annotations in a PDF file. I use a PDF file produced by Microsoft Word -> Save As PDF. The Word file has 3...