dumppdf.py -T throws an exception: "NameError: name 's' is not defined"

Open jeffstearns opened this issue 5 years ago • 7 comments

dumppdf.py throws an exception when the -T option is used.

pdfminer version 20191125 running with Python 3.7.6 on OS X 10.13.6

To reproduce:

  1. Create a valid PDF.
  2. Run dumppdf.py -T on this PDF.
% dumppdf.py -a -T ~/tmp/foo.pdf
<outlines>
Traceback (most recent call last):
  File "/usr/local/bin/dumppdf.py", line 272, in <module>
    if __name__ == '__main__': sys.exit(main(sys.argv))
  File "/usr/local/bin/dumppdf.py", line 269, in main
    dumpall=dumpall, mode=mode, extractdir=extractdir)
  File "/usr/local/bin/dumppdf.py", line 151, in dumpoutline
    outfp.write('<outline level="%r" title="%s">\n' % (level, q(s)))
NameError: name 's' is not defined

jeffstearns avatar Jan 12 '20 19:01 jeffstearns

I'm not getting this error using the samples/simple1.pdf file. Could you share the pdf file you are using?

pietermarsman avatar Jan 13 '20 21:01 pietermarsman

Here is a file that causes the problem.

I have many others that also fail. This file and the others were processed by ABBYY FineReader OCR software, which might be relevant to this bug.

dmv.pdf

jeffstearns avatar Jan 15 '20 04:01 jeffstearns

I had the same problem as @jeffstearns. I changed line 151 of dumppdf.py to:

                outfp.write('<outline level="%r" title="%s">\n' % (level, q(title)))

and got rid of the error message.
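For anyone following along, here is a minimal sketch of why the one-word change works. It is not the actual dumppdf.py source: `q` is a stand-in escaping helper, and `write_outlines_buggy`, `write_outlines_fixed`, and `run_demo` are hypothetical names that only mirror the shape of line 151. Python resolves names at run time, so the reference to an undefined `s` fails only once the loop body executes. That may be why a PDF without outline entries does not trigger the error, though this is an assumption and was not checked against samples/simple1.pdf.

```python
import io
from xml.sax.saxutils import escape


def q(text):
    # Stand-in for the script's escaping helper (assumption): XML-escape a title.
    return escape(text, {'"': "&quot;"})


def write_outlines_buggy(outfp, outlines):
    for level, title in outlines:
        # Mirrors the failing line: `s` is never assigned in this scope.
        outfp.write('<outline level="%r" title="%s">\n' % (level, q(s)))  # noqa: F821


def write_outlines_fixed(outfp, outlines):
    for level, title in outlines:
        # The workaround from this thread: use the loop variable `title`.
        outfp.write('<outline level="%r" title="%s">\n' % (level, q(title)))


def run_demo():
    entries = [(1, 'Chapter "One"')]
    # With no outline entries the buggy loop body never runs, so nothing fails.
    write_outlines_buggy(io.StringIO(), [])
    try:
        write_outlines_buggy(io.StringIO(), entries)
        error = None
    except NameError as exc:
        error = str(exc)
    out = io.StringIO()
    write_outlines_fixed(out, entries)
    return error, out.getvalue()
```

Calling `run_demo()` returns the NameError message from the buggy variant together with the XML line written by the fixed one.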

igavronski avatar Jul 21 '20 22:07 igavronski

After fixing the first error with @igavronski's workaround, I get another one:

File "dumppdf.py", line 143, in dumpoutline
    pageno = pages[dest[0].objid]
TypeError: 'PDFObjRef' object is not subscriptable
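The traceback suggests that `dest` is still an indirect reference when it is indexed. A possible direction, sketched below under stated assumptions, is to resolve the reference before subscripting it. `FakeObjRef`, `resolve_fully`, and `page_number_for` are hypothetical stand-ins written for this illustration, not pdfminer code. pdfminer itself provides `resolve1` in `pdfminer.pdftypes`, which looks like the natural tool for this, but whether using it at line 143 is the right fix has not been verified here.

```python
class FakeObjRef:
    """Stand-in for an indirect object reference (hypothetical, not pdfminer's class)."""

    def __init__(self, target, objid=None):
        self._target = target
        self.objid = objid

    def resolve(self):
        return self._target


def resolve_fully(obj):
    # Follow references until a concrete object is reached.
    while isinstance(obj, FakeObjRef):
        obj = obj.resolve()
    return obj


def page_number_for(dest, pages):
    # Resolve the destination first, so it is a list before it is indexed.
    dest = resolve_fully(dest)
    # A destination may also be a dictionary holding the array under "D".
    if isinstance(dest, dict):
        dest = resolve_fully(dest["D"])
    # The first element stays a reference to the page; only its objid is needed.
    return pages[dest[0].objid]


def demo():
    page_ref = FakeObjRef({"Type": "Page"}, objid=7)
    pages = {7: 1}  # maps page object id to page number
    dest = FakeObjRef([page_ref, "Fit"])
    try:
        dest[0]  # indexing the unresolved reference fails, as in the traceback
        failed = False
    except TypeError:
        failed = True
    return failed, page_number_for(dest, pages)
```

In this toy setup, indexing the unresolved reference raises a TypeError, while `page_number_for` looks up the page number after resolving it.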

jrkager avatar Jan 29 '21 15:01 jrkager

Same issue as @jrkager

ghost avatar Jan 29 '21 16:01 ghost

@jrkager @entelven :: same PDF (dmv.pdf) or another one? Could you please tell us your Python version and environment?

igavronski avatar Jan 29 '21 19:01 igavronski

Another PDF. Python 3.9.1 in conda env.

However, the same command on the same PDF worked with pdfminer.six and I got what I wanted.

jrkager avatar Jan 30 '21 22:01 jrkager