pypdf
After using updatePageFormFieldValues PyPDF2 cannot read fields with getFormTextFields
After I update a page's fields, they can no longer be read with PyPDF2. I am using the /NeedAppearances trick to make them visible in my PDF viewer (PDF-XChange).
If I open the files with PDF-XChange and close them, I can read the fields with PyPDF2 again.
I noticed that the document info of the updated files does not contain the /Fields section:
original document:
{'/ModDate': "D:20180708222539-06'00'", '/Producer': 'PyPDF2', '/Fields': [IndirectObject(3, 0), IndirectObject(4, 0), IndirectObject(5, 0)]}
updated fields:
{'/NeedAppearances': <PyPDF2.generic.BooleanObject object at 0x0000026DB2265470>, '/Producer': 'PyPDF2'}
I am not sure how to add the /Fields section back.
Thanks
This is related to, if not the same as, issue #355.
Thank you for sharing that observation. I've created an example to confirm it:
from PyPDF2 import PdfReader, PdfWriter

reader = PdfReader("resources/form.pdf")
writer = PdfWriter()
# add_page() copies the page itself, not document-level entries of the Catalog
writer.add_page(reader.pages[0])
writer.write("forms-after-writing.pdf")
Within resources/form.pdf, we have /AcroForm 22 0 R within the Catalog:
34 0 obj
<<
/Type /Catalog
/Pages 21 0 R
/Names 33 0 R
/PageMode /UseNone
/AcroForm 22 0 R
/OpenAction 1 0 R
>>
endobj
That object is the interactive form (AcroForm) dictionary, which holds the /Fields array and looks like this:
22 0 obj
<<
/Fields [ 15 0 R ]
/DR <<
/Font <<
/ZaDb 5 0 R
/Helv 6 0 R
>>
>>
/DA (/Helv 10 Tf 0 g)
/NeedAppearances true
>>
endobj
After executing the script above, the Catalog of forms-after-writing.pdf no longer has an /AcroForm entry:
4 0 obj
<<
/Type /Catalog
/Pages 1 0 R
>>
endobj
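The same check can be done from Python instead of reading the raw objects. This is only a minimal sketch: the helper name describe_catalog is made up for illustration, and it assumes the pypdf package is installed and that the Catalog is reachable through reader.trailer["/Root"].

```python
def describe_catalog(pdf_path):
    """Return the Catalog's keys and whether it has an /AcroForm entry.

    describe_catalog is a hypothetical helper, not part of pypdf.
    """
    # Imported inside the function; assumes the pypdf package is installed.
    from pypdf import PdfReader

    # The trailer's /Root entry is the document Catalog.
    catalog = PdfReader(pdf_path).trailer["/Root"]
    keys = sorted(str(key) for key in catalog.keys())
    return keys, "/AcroForm" in catalog
```

Calling describe_catalog("resources/form.pdf") and describe_catalog("forms-after-writing.pdf") should make the difference visible: going by the object dumps above, only the first file would be expected to report an /AcroForm entry.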
@MartinThoma was correct: in order to copy the /AcroForm tree, you have to use append().
We can close this issue.
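For anyone landing here later, the append() approach might look roughly like the sketch below. It assumes a recent pypdf release in which PdfWriter.append() is available; the function name copy_with_form and its return value are invented for this example.

```python
def copy_with_form(src_path, dst_path):
    """Copy a PDF with append() and report whether /AcroForm survived.

    copy_with_form is a hypothetical helper, not part of pypdf.
    """
    # Imported inside the function; assumes a recent pypdf is installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    # append() merges the whole document rather than a single page,
    # unlike add_page(reader.pages[0]) in the script above.
    writer.append(reader)
    with open(dst_path, "wb") as fh:
        writer.write(fh)

    # Re-open the result and look for /AcroForm in its Catalog.
    return "/AcroForm" in PdfReader(dst_path).trailer["/Root"]
```

With the example file from this thread, copy_with_form("resources/form.pdf", "forms-after-writing.pdf") would be the replacement for the add_page() script. Whether the fields are then readable again may still depend on the pypdf version in use.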