pypdf

Writer Object Containing AcroForm does not contain /Fields causing issues in writer.update_page_form_field_values()

Open martzro opened this issue 1 year ago • 2 comments

The PDF I'm attempting to fill in has a known issue where all of the fields appear blank unless they are clicked on in Acrobat. I narrowed the issue down to a specific button (a check box) in my PDF that, when activated (its value set to /Yes), causes all other fields to break.

The fix in the new pypdf update seemed to be adding writer.set_need_appearances_writer(True), which updates the writer object to include the AcroForm object. However, the update also checks for a /Fields entry inside AcroForm, which does not exist in my case, and this is causing my code to break.

If this code is commented out (lines 1011-1012 in _writer.py):

    if InteractiveFormDictEntries.Fields not in af:
        raise PyPdfError("No /Fields dictionary in Pdf in PdfWriter Object")

the code runs, except that I now get a new error:

Font dictionary for /Helvetica not found.

However, when exporting the PDF, my problematic checkbox is checked and all fields are shown without having to click on them (the original issue), except that the font is now messed up.

Environment

Which environment were you using when you encountered the problem?

Windows-10-10.0.19045-SP0
Python 3.10

Code + PDF

from pypdf import PdfReader, PdfWriter
import requests


def make_universal_permit(pdf_form):
    name = 'Test'

    # Set up pdf reader
    reader = PdfReader(pdf_form)
    writer = PdfWriter()
    writer.set_need_appearances_writer(True)

    fields_to_change = [
        # Test fields
        {"structural": "/Yes",
         "contractor": "/Yes",
         "New": "/Yes",
         "Primary-permit": "/Yes",  # This is the problematic check box
         "Property_Owner": name,
         "Property_owner_address": name,
         "Property_owner_city": name,
         },
        {
            "Same-as-owner": "/Yes",
            "bonding-company-not-applicable": "/Yes"
        },
    ]

    # Make changes per page
    for page_no, page in enumerate(reader.pages):
        writer.add_page(page)
        writer.update_page_form_field_values(
            writer.pages[page_no], fields_to_change[page_no]
        )

    EXPORT_NAME = 'test_pdf.pdf'
    with open(EXPORT_NAME, "wb") as output_stream:
        writer.write(output_stream)


# Download PDF and update
url = 'https://discover.pbcgov.org/pzb/building/Forms/Building%20Division%20Permit%20Application.pdf'
response = requests.get(url)
if response.status_code == 200:
    file = response.content
    open('test_file.pdf', 'wb').write(file)
    make_universal_permit('test_file.pdf')

PDF link is included

Traceback

When running without any changes:

Traceback (most recent call last):
  File "C:\Users\Joey\Desktop\PE Forms\classes\test.py", line 47, in <module>
    make_universal_permit('test_file.pdf')
  File "C:\Users\Joey\Desktop\PE Forms\classes\test.py", line 32, in make_universal_permit
    writer.update_page_form_field_values(
  File "C:\Users\Joey\Desktop\PE Forms\venv\Lib\site-packages\pypdf\_writer.py", line 1012, in update_page_form_field_values
    raise PyPdfError("No /Fields dictionary in Pdf in PdfWriter Object")
pypdf.errors.PyPdfError: No /Fields dictionary in Pdf in PdfWriter Object

When commenting out /Fields check:

Font dictionary for /Helvetica not found.
Font dictionary for /Helvetica not found.
Font dictionary for /Helvetica not found.

When not using the writer.set_need_appearances_writer(True):

Traceback (most recent call last):
  File "C:\Users\Joey\Desktop\PE Forms\classes\test.py", line 47, in <module>
    make_universal_permit('test_file.pdf')
  File "C:\Users\Joey\Desktop\PE Forms\classes\test.py", line 32, in make_universal_permit
    writer.update_page_form_field_values(
  File "C:\Users\Joey\Desktop\PE Forms\venv\Lib\site-packages\pypdf\_writer.py", line 1009, in update_page_form_field_values
    raise PyPdfError("No /AcroForm dictionary in PdfWriter Object")
pypdf.errors.PyPdfError: No /AcroForm dictionary in PdfWriter Object

martzro avatar Oct 27 '23 23:10 martzro

writer.update_page_form_field_values(
    writer.pages[page_no], fields_to_change[page_no]
)

fields_to_change should be a dictionary, but I can't work out what to put in it. I extracted information about the field I want to change in the PDF document with these commands:

fields = reader.get_fields()
print(fields)

The result is like this:

{'AFF000': {'/T': 'AFF000', '/FT': '/Tx', '/Ff': 0, '/V': 'hello world!', '/DV': 'hello world!'}}

Then I pass this into the call:

writer.update_page_form_field_values(
    writer.pages[0], {'AFF000': {'/T': 'AFF000', '/FT': '/Tx', '/Ff': 0, '/V': 'NEW TEXT', '/DV': 'WTF?'}}
)

I tried composing the dictionary in different ways; in every case, the call to writer.update_page_form_field_values raises PyPdfError("No /AcroForm dictionary in PdfWriter Object").
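
For reference, the original report above passes update_page_form_field_values a flat mapping from field name to new value (for example "Property_Owner": name), not the nested per-field dictionaries that get_fields() prints. A minimal sketch of reducing the printed structure to that flat shape follows. It works on a plain dict shaped like the output shown above, and the helper name flat_values is hypothetical, not part of pypdf.

```python
def flat_values(fields, overrides=None):
    """Reduce a get_fields()-style mapping to {field name: value}.

    `fields` is assumed to look like the printed output above:
    {name: {"/T": ..., "/FT": ..., "/V": ...}}.
    `overrides` holds the new values to use instead of the current "/V".
    """
    overrides = overrides or {}
    result = {}
    for name, info in fields.items():
        if name in overrides:
            # A new value was requested for this field.
            result[name] = overrides[name]
        elif "/V" in info:
            # Otherwise keep the value the field already has.
            result[name] = info["/V"]
    return result


fields = {
    "AFF000": {
        "/T": "AFF000",
        "/FT": "/Tx",
        "/Ff": 0,
        "/V": "hello world!",
        "/DV": "hello world!",
    }
}
new_values = flat_values(fields, {"AFF000": "NEW TEXT"})
print(new_values)
```

The resulting new_values is the kind of dictionary used as the second argument in the original report. Note that this only addresses the shape of the dictionary; the "No /AcroForm dictionary" error is a separate matter.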

vvnbel avatar Feb 27 '24 19:02 vvnbel

@martzro you are inserting the document page by page, but the field structure lives outside of the pages. You should clone the document instead (not tested, may contain some 'spelling' errors):

from pypdf import PdfReader, PdfWriter
import requests


def make_universal_permit(pdf_form):
    name = 'Test'

    # Set up pdf reader
    reader = PdfReader(pdf_form)
    writer = PdfWriter(clone_from=reader)               # <--- 
    writer.set_need_appearances_writer(True)

    fields_to_change = [
        # Test fields
        {"structural": "/Yes",
         "contractor": "/Yes",
         "New": "/Yes",
         "Primary-permit": "/Yes",  # This is the problematic check box
         "Property_Owner": name,
         "Property_owner_address": name,
         "Property_owner_city": name,
         },
        {
            "Same-as-owner": "/Yes",
            "bonding-company-not-applicable": "/Yes"
        },
    ]

    # Make changes per page
    # <-- change the looping
    for page_no, page in enumerate(writer.pages):
        writer.update_page_form_field_values(
            page, fields_to_change[page_no]
        )

    # <-- simpler
    writer.write('test_pdf.pdf')


# Download PDF and update
url = 'https://discover.pbcgov.org/pzb/building/Forms/Building%20Division%20Permit%20Application.pdf'
response = requests.get(url)
if response.status_code == 200:
    file = response.content
    open('test_file.pdf', 'wb').write(file)
    make_universal_permit('test_file.pdf')
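
One thing to watch in the loop above: it indexes fields_to_change by page number, so it would raise IndexError if the document had more pages than there are dictionaries in the list. A minimal sketch of a guard, using plain strings as stand-ins for the page objects; the helper name values_per_page is hypothetical and not part of pypdf:

```python
def values_per_page(pages, per_page_values):
    """Yield (page, values) pairs, skipping pages that have no values."""
    for page_no, page in enumerate(pages):
        # Only yield pages that have a non-empty dictionary of values.
        if page_no < len(per_page_values) and per_page_values[page_no]:
            yield page, per_page_values[page_no]


# Stand-ins for writer.pages: three pages, but values for only two.
pages = ["page-0", "page-1", "page-2"]
fields_to_change = [{"New": "/Yes"}, {"Same-as-owner": "/Yes"}]

pairs = list(values_per_page(pages, fields_to_change))
for page, values in pairs:
    # With pypdf this is where the call would go:
    # writer.update_page_form_field_values(page, values)
    print(page, values)
```

The third page is simply skipped here because no values were supplied for it.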

pubpub-zz avatar Feb 27 '24 20:02 pubpub-zz

Without feedback, I am closing this issue as solved.

pubpub-zz avatar Apr 07 '24 17:04 pubpub-zz