pdfrw

Form Field Data disappearing on merge

Open wmoskal opened this issue 5 years ago • 8 comments

Hi,

So I have a fillable PDF that I am using to create a number of PDFs with data. When attempting to merge them into a single PDF, the behavior is undefined, likely because the field names are the same in all of the PDFs that went into the merged PDF. Is there a way to get around this with pdfrw? I have attempted to flatten the PDFs individually, but that just removes the form data; I have attempted to add a NeedAppearances tag to the PDF, which did not work; and I have attempted to convert the PDF to PostScript and back again, which did not work for the textual data but did work for the check boxes. Any help would be greatly appreciated.

wmoskal avatar Jul 10 '19 18:07 wmoskal

Hi, I had the same problem and coded a solution myself; see the answer in this post on Stack Overflow: https://stackoverflow.com/questions/57008782/pypdf2-pdffilemerger-loosing-pdf-module-in-merged-file It is an incomplete solution, but it works for me and can be a starting point for the pdfrw maintainers to build on. Hope this helps.

AldoErco avatar Sep 17 '19 07:09 AldoErco

@AldoErco Any advice on using this method when merging two pages instead of appending them?

alechalama avatar Sep 04 '20 01:09 alechalama

What exactly do you mean by "merging two pages"? E.g., how should the content on line 10 of both pages be merged? As for merging two forms into one, I think the same method can be applied: you just need a way to work out a unique name for each field. In the worst case, just append a randomly generated UUID to the field names and that will work. But I am not sure I understand what "merging two pages" means. BR
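For what it's worth, the UUID idea could be sketched roughly like this with pdfrw. This is only a sketch under assumptions: `unique_field_name` and `rename_fields` are made-up helper names, and it assumes each field's name sits in the /T entry of a widget annotation on the page. Forms that keep names on a /Parent field (radio groups, nested fields) would need extra handling that is not shown here.

```python
import uuid


def unique_field_name(name, suffix=None):
    """Return name with a suffix appended; a random UUID hex is used if none is given."""
    if suffix is None:
        suffix = uuid.uuid4().hex
    return f"{name}_{suffix}"


def rename_fields(in_path, out_path, suffix=None):
    """Write a copy of in_path whose widget field names all carry the same suffix."""
    # pdfrw is imported here so the helper above can be used without it installed.
    from pdfrw import PdfReader, PdfWriter, PdfString

    suffix = suffix or uuid.uuid4().hex  # one suffix per file keeps names readable
    reader = PdfReader(in_path)
    for page in reader.pages:
        for annot in page.Annots or []:
            # Assumption: the field name lives directly on the widget annotation.
            if annot.Subtype == "/Widget" and annot.T is not None:
                old_name = annot.T.to_unicode()
                annot.T = PdfString.encode(unique_field_name(old_name, suffix))
    PdfWriter(out_path, trailer=reader).write()
```

Running `rename_fields` once per filled PDF before merging should give every copy its own set of field names, which is the precondition the merge approach relies on.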

AldoEng avatar Sep 04 '20 09:09 AldoEng

I am struggling with the same issue. I created separate pdf files from a pdf form file I had and I inserted all the values in each file successfully making them readonly using the AP flag in each field (it still is a pdf form file though). However, when I use the PdfWriter to append each page from the separate pdf files I created, the final output is completely blank.

ghost avatar Sep 16 '20 08:09 ghost

Just to be sure I understand. You have PDF file A. From this, you create 2 new PDFs: B and C. You make B and C readonly using the AP flag in each field. Then you try to create PDF file D by appending B and C again. Doing so results in a completely blank D PDF file. Correct?

AldoEng avatar Sep 16 '20 17:09 AldoEng

@AldoEng Exactly

ghost avatar Sep 17 '20 05:09 ghost

The typical pattern I've seen posted for merging PDFs is something like

from pdfrw import PdfReader, PdfWriter

writer = PdfWriter()
for fname in files:  # files: list of input PDF paths
    r = PdfReader(fname)
    writer.addpages(r.pages)
writer.write("output.pdf")

This of course loses the interactive form features because the interactive form dictionary, AcroForm, is not copied. To do this accurately, you need to merge the dictionaries. But if the PDFs come from the same source (basically the same form with the fields renamed), you can just copy the AcroForm from one of them. The Stack Overflow link posted above explains the merging. Under the assumption that the interactive form dictionaries are essentially the same, you can merge using this:

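from pdfrw import PdfReader, PdfWriter

writer = PdfWriter()
r = PdfReader(files[0])  # files: list of input PDF paths
writer.addpages(r.pages)
acro_form = r.Root.AcroForm  # keep the first file's interactive form dictionary
for fname in files[1:]:
    r = PdfReader(fname)
    writer.addpages(r.pages)
writer.trailer.Root.AcroForm = acro_form  # reattach it to the merged output
writer.write("output.pdf")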

I basically copied the interactive form dictionary from the first file to my output. Some of its settings control the fonts, colors, and spacing of the form elements, so there may be situations where not all form characteristics can be preserved. Anyhow, this tidbit works in my use case, where I fill a single PDF multiple times and merge the results into one file.
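If the first file's field list alone is not enough, one possible variation is to gather the top-level Fields entries from every input into the copied dictionary and to warn about name clashes. This is a sketch, not a tested recipe: `duplicate_names` and `merge_forms` are hypothetical names, and it assumes every input's AcroForm shares the same defaults (fonts, appearance settings) so only the Fields arrays need combining.

```python
from collections import Counter


def duplicate_names(names_per_file):
    """Return the sorted field names that occur more than once across all files."""
    counts = Counter(name for names in names_per_file for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


def merge_forms(files, out_path):
    """Append all pages and attach one AcroForm listing the fields of every input."""
    # pdfrw is imported here so duplicate_names can be used without it installed.
    from pdfrw import PdfReader, PdfWriter, PdfArray

    writer = PdfWriter()
    merged_form = None
    all_fields = PdfArray()
    names_per_file = []
    for fname in files:
        reader = PdfReader(fname)
        writer.addpages(reader.pages)
        form = reader.Root.AcroForm
        if form is None:
            continue  # this input has no interactive form
        if merged_form is None:
            merged_form = form  # reuse the first form's defaults
        fields = list(form.Fields or [])
        all_fields.extend(fields)
        names_per_file.append([f.T.to_unicode() for f in fields if f.T is not None])

    clashes = duplicate_names(names_per_file)
    if clashes:
        print("Warning: field names shared between inputs:", clashes)

    if merged_form is not None:
        merged_form.Fields = all_fields
        writer.trailer.Root.AcroForm = merged_form
    writer.write(out_path)
```

A warning here would suggest renaming the fields in each input first, since viewers generally treat same-named fields as one field with a single value.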

summerswallow-whi avatar Mar 02 '21 00:03 summerswallow-whi

@summerswallow-whi This solution worked for me as well. I was able to set the AcroForm on writer.trailer for a writer with any number of pages and still have the fillable text reappear.

sage-gendron avatar Sep 14 '21 15:09 sage-gendron