PyPDF2 appends the same file

Open Mahdi-Hosseinali opened this issue 6 years ago • 3 comments

I'm trying to automate filling a PDF form (unsurprisingly, it's a badly built PDF). PyPDF2 seems to be the best option, despite the bugs that Python PDF packages share, such as filled fields not showing in Acrobat Reader and no way to fill checkboxes or radio buttons. To work around the missing checkbox support, I'm trying to append all the generated files into one document so the user can scroll through and tick the boxes manually before printing the form. However, no matter how I try this, the same PDF page gets appended multiple times, even though the individual files are correct. To reproduce the problem, try the following with the linked PDF:

from PyPDF2 import PdfMerger, PdfReader, PdfWriter


reader = PdfReader("staffchange2015blk.pdf")
page = reader.pages[0]

# Fill PDF and store it for 3 people
for i in range(3):
    writer = PdfWriter()
    writer.update_page_form_field_values(page, {"Name[0]": "user" + str(i)})
    writer.add_page(page)
    with open(str(i) + ".pdf", "wb") as fh:
        writer.write(fh)

# Merge all 3 documents
merger = PdfMerger()
for i in range(3):
    merger.append(str(i) + ".pdf")

with open("batch.pdf", "wb") as fh:
    merger.write(fh)

Mahdi-Hosseinali avatar Nov 18 '17 18:11 Mahdi-Hosseinali

@meylone I ran your code. The merged file, "batch.pdf", looks OK to me when viewed in non-PDF-specific reader apps on my iMac such as Preview, Safari or iBooks. The LAST NAME field is filled with a different name (user0, user1, etc.) on each page of the merged file.

When I open the same file in Chrome, however, every page appears to show "user0". In addition, when I open the merged file (or the individual PDF files) in a dedicated PDF reader such as Adobe Acrobat Pro X or Adobe Acrobat Reader DC, the LAST NAME value appears only after I click in the field manually, and on the second and third pages of the merged file nothing appears even after clicking.

Yet, if I export the merged file to a new PDF using the Preview app, the new file looks OK in whichever app I use to view it, including Adobe Acrobat Pro X. I reckon this is because the new file no longer has a PDF form structure after being exported through Preview.

In summary, the problem you are facing seems to come from both the PDF viewer you use after merging and the way the merge method in PyPDF2 interacts with the form structure of the PDF files, if any. You might be able to work around it by sticking to a non-PDF-specific reader to view or print the merged file, but, of course, this may not be a stable solution in the long run, or one that works on all operating systems, e.g. Windows or Linux.

apaksoy avatar Nov 22 '17 02:11 apaksoy

Thanks. Unfortunately, the code will be used on Windows, so I cannot try out the options you mentioned. I'm not sure the problem depends on the viewer. If you fill a form manually, it looks fine regardless of the viewer, so to me the problem seems to lie in the way PyPDF2 handles the structure of the PDF (if you fill an individual file, the value doesn't show unless you click on the field, which is a well-known problem). PyPDF2 is a handy library and easy to use (compared to many others, which require digging deep into PDF structure before you can do anything). I hope its maintenance continues and small issues such as this get solved.

Mahdi-Hosseinali avatar Nov 27 '17 15:11 Mahdi-Hosseinali

@Mahdi-Hosseinali The document you've linked is no longer available. Do you have another example file?

I don't see an issue with resources/forms.py except for #355.

MartinThoma avatar Aug 06 '22 12:08 MartinThoma

@Mahdi-Hosseinali, some improvements have been introduced for merging forms: https://pypdf.readthedocs.io/en/latest/user/merging-pdfs.html#merging-forms Can you try it and confirm whether the problem is fixed?
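
For anyone following along, here is a rough sketch of what merging the three filled copies could look like with the newer API. It rests on two assumptions that should be checked against the linked documentation: that the installed pypdf version provides `PdfReader.add_form_topname()`, and that the symptom in this issue comes from every copy carrying a field with the same name ("Name[0]"), which a viewer may treat as a single shared field. The helpers `topname_for`, `qualified_name` and `merge_forms` are hypothetical names used only for illustration.

```python
from typing import Iterable


def topname_for(index: int) -> str:
    """Hypothetical naming scheme: one top-level form name per merged copy."""
    return f"form{index}"


def qualified_name(topname: str, field: str) -> str:
    """Fully qualified field name: parent and child names joined by a period."""
    return f"{topname}.{field}"


def merge_forms(paths: Iterable[str], out_path: str) -> None:
    """Merge filled forms, nesting each copy's fields under its own parent."""
    # Imported lazily so the helpers above can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for index, path in enumerate(paths):
        reader = PdfReader(path)
        # Assumption: add_form_topname() exists in the installed pypdf version.
        # It is meant to put this file's fields under a uniquely named parent,
        # so "Name[0]" in copy 0 no longer collides with "Name[0]" in copy 1.
        reader.add_form_topname(topname_for(index))
        writer.append(reader)

    with open(out_path, "wb") as fh:
        writer.write(fh)
```

With the files from the original report this would be called as `merge_forms(["0.pdf", "1.pdf", "2.pdf"], "batch.pdf")`. Whether the values then display without clicking into the field still depends on the viewer.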

pubpub-zz avatar Feb 09 '23 05:02 pubpub-zz

Unfortunately, I don't have access to this PDF anymore. I guess we can mark this closed.

Mahdi-Hosseinali avatar Feb 09 '23 06:02 Mahdi-Hosseinali