error reading fillable pdf

Open lylia0009 opened this issue 1 year ago • 1 comments

I am trying to update a fillable PDF form with Python, but reading the file throws an exception. Below are the code, a sample PDF, and the error. Has anyone seen this before?

import pdfrw
pdf = pdfrw.PdfReader(input_pdf)

Error message

~\Anaconda3\lib\site-packages\pdfrw\pdfreader.py in __init__(self, fname, fdata, decompress, decrypt, password, disable_gc, verbose)
    648
    649         if is_stream:
--> 650             self.load_stream_objects(trailer.object_streams)
    651
    652         while xref_list:

~\Anaconda3\lib\site-packages\pdfrw\pdfreader.py in load_stream_objects(self, object_streams)
    306             firstoffset = int(obj.First)
    307             while objsource.floc < firstoffset:
--> 308                 offsets.append((int(next()), firstoffset + int(next())))
    309             for num, offset in offsets:
    310                 # Read the object, and call special code if it starts

ValueError: invalid literal for int() with base 10: "ÔÞíWÎ0vætºUÐ\x8a\x13õ#v\x9f0uÀ\x08C5n¡³ñų\x9d\x93\x91\x06Åo\x11j\x8eO8êøÏ\x96\x1fá?\x10ãoÂõõÀù,,1!6HêíG\x1eb\x18½'éïz²å½#¸,\x9e"

Sample PDF: https://www.uobgroup.com/hk/assets/pdfs/Billof-Exchange.pdf

lylia0009, Sep 23 '24 02:09
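
For context, the failing line in `load_stream_objects` reads whitespace-separated integer pairs (object number, offset) from the start of an object stream. The sketch below is plain Python, not pdfrw's actual code; the helper name `parse_objstm_header` and the sample bytes are hypothetical. It shows that such a header parses cleanly once decoded, and that bytes which were never decoded trigger the same kind of `ValueError`.

```python
def parse_objstm_header(data: bytes, first: int) -> list[tuple[int, int]]:
    """Parse the integer pairs that precede the objects in an object stream.

    Hypothetical helper for illustration only; not pdfrw's implementation.
    `first` plays the role of the stream's /First entry.
    """
    tokens = iter(data[:first].decode("latin-1").split())
    pairs = []
    for num in tokens:
        obj_num = int(num)               # ValueError if the token is not digits
        offset = int(next(tokens, ""))   # ValueError if the pair is incomplete
        # Offsets in the header are relative to /First.
        pairs.append((obj_num, first + offset))
    return pairs


# A decoded stream holding two objects; the header is 10 bytes long.
decoded = b"11 0 12 5 <<>> <</A 1>>"
print(parse_objstm_header(decoded, 10))  # [(11, 10), (12, 15)]

# Bytes that are still compressed or encrypted are not digits,
# so int() fails in the same way as in the traceback.
try:
    parse_objstm_header(b"\xd4\xde\xedW\xce0v\xe6t\xbaU", 11)
except ValueError as exc:
    print("ValueError:", exc)
```

If this is what happens here, the exception is a symptom: the stream content reached the parser without being decoded first.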

I have a similar issue while reading a PDF with forms:

[ERROR] uncompress.py:80 Error -3 while decompressing data: incorrect header check (1, 0)
[ERROR] uncompress.py:80 Error -3 while decompressing data: incorrect header check (3, 0)
[ERROR] uncompress.py:80 Error -3 while decompressing data: incorrect header check (5, 0)
[ERROR] uncompress.py:80 Error -3 while decompressing data: incorrect header check (6, 0)
Traceback (most recent call last):
  File "test.py", line 32, in <module>
    main()
  File "test.py", line 28, in main
    fill_pdf("test.pdf", None, {})
  File "test.py", line 5, in fill_pdf
    template_pdf = PdfReader(input_pdf)
                   ^^^^^^^^^^^^^^^^^^^^
  File "test/.venv/lib/python3.12/site-packages/pdfrw/pdfreader.py", line 648, in __init__
    self.load_stream_objects(trailer.object_streams)
  File "test/.venv/lib/python3.12/site-packages/pdfrw/pdfreader.py", line 306, in load_stream_objects
    offsets.append((int(next()), firstoffset + int(next())))
                    ^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'GZê¿S¬§£Ï&^àØVzu\x91ú·q\x13Ïá¥J\x87\x0bxý'

atoav, Jan 21 '25 10:01
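
The `Error -3 while decompressing data: incorrect header check` lines in the log are the message Python's `zlib` module raises when its input does not begin with a valid zlib header. A small standalone sketch reproduces it with arbitrary illustrative bytes. Whether the streams in these PDFs are encrypted or otherwise not plain Flate data is an assumption, not something confirmed in this thread.

```python
import zlib

payload = b"11 0 12 5 "
compressed = zlib.compress(payload)

# Valid zlib data round-trips without error.
assert zlib.decompress(compressed) == payload

# Arbitrary non-zlib bytes (illustrative, not taken from either PDF).
not_zlib = b"GZ\xea\xbfS"
try:
    zlib.decompress(not_zlib)
    message = ""
except zlib.error as exc:
    message = str(exc)

print(message)
```

One thing worth checking under that assumption is whether the file's trailer contains an `/Encrypt` entry, since encrypted stream data would fail decompression in this way.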