pdfminer
dumppdf.py -T throws an exception: "NameError: name 's' is not defined"
dumppdf.py throws an exception when the -T option is used.
pdfminer version 20191125 running with Python 3.7.6 on OS X 10.13.6
To reproduce:
- Create a valid PDF.
- Run dumppdf.py -T on this PDF.
% dumppdf.py -a -T ~/tmp/foo.pdf
<outlines>
Traceback (most recent call last):
  File "/usr/local/bin/dumppdf.py", line 272, in <module>
    if __name__ == '__main__': sys.exit(main(sys.argv))
  File "/usr/local/bin/dumppdf.py", line 269, in main
    dumpall=dumpall, mode=mode, extractdir=extractdir)
  File "/usr/local/bin/dumppdf.py", line 151, in dumpoutline
    outfp.write('<outline level="%r" title="%s">\n' % (level, q(s)))
NameError: name 's' is not defined
I'm not getting this error using the samples/simple1.pdf file. Could you share the PDF file you are using?
Here is a file that causes the problem.
I have many others that also fail. This file (and the others) were processed by ABBYY FineReader OCR software. That might be relevant to this bug.
I had the same problem as @jeffstearns
I changed line 151 of dumppdf.py to:
outfp.write('<outline level="%r" title="%s">\n' % (level, q(title)))
and got rid of the error message.
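The workaround above can be illustrated with a minimal, self-contained sketch. It is not the actual dumppdf.py code: `q` here is a stand-in escaping helper built on `html.escape`, and `write_outline_open` is a hypothetical wrapper name. The only point is that the format string must use a name that exists in scope (`title`) rather than the undefined `s`.

```python
import io
from html import escape


def q(text):
    # Stand-in for dumppdf.py's escaping helper (assumption: the real
    # helper may escape a different set of characters).
    return escape(text, quote=True)


def write_outline_open(outfp, level, title):
    # Hypothetical wrapper around the line quoted in the thread.
    # It references `title`, which is defined, instead of `s`, which is not.
    outfp.write('<outline level="%r" title="%s">\n' % (level, q(title)))


buf = io.StringIO()
write_outline_open(buf, 1, 'Chapter <1>')
result = buf.getvalue()
```

A linter such as pyflakes would normally flag an undefined name like `s` before the script is ever run.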
After solving the first error message with @igavronski's workaround, I get another one:
  File "dumppdf.py", line 143, in dumpoutline
    pageno = pages[dest[0].objid]
TypeError: 'PDFObjRef' object is not subscriptable
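This error suggests that `dest` is still an indirect reference when it is indexed, so it likely needs to be resolved to the underlying array first. The sketch below shows that idea under stated assumptions: `FakeObjRef`, `resolve_fully`, and `page_number_for` are hypothetical names for illustration, not pdfminer's API. pdfminer itself ships a `resolve1` helper in `pdfminer.pdftypes` for following references; whether calling it at line 143 is the right fix for this particular file is untested here.

```python
class FakeObjRef:
    """Hypothetical stand-in for an indirect reference (not pdfminer's PDFObjRef)."""

    def __init__(self, target, objid=None):
        self._target = target
        self.objid = objid

    def resolve(self):
        return self._target


def resolve_fully(obj):
    # Follow references until a concrete (non-reference) object is reached.
    while isinstance(obj, FakeObjRef):
        obj = obj.resolve()
    return obj


def page_number_for(dest, pages):
    # `dest` may itself be a reference to the destination array,
    # so resolve it before subscripting.
    dest = resolve_fully(dest)
    # The first element stays a reference; only its object id is needed.
    return pages[dest[0].objid]


# Map from page object id to page number, as in the quoted line.
pages = {12: 1, 15: 2}
page_ref = FakeObjRef({'Type': 'Page'}, objid=15)
dest = FakeObjRef([page_ref, 'XYZ', 0, 792, None])
pageno = page_number_for(dest, pages)
```

Without the `resolve_fully` call, `dest[0]` would be attempted on the reference object itself, which reproduces the "not subscriptable" failure.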
Same issue as @jrkager
@jrkager @entelven :: same PDF (dmv.pdf) or another one? Could you please tell us your Python version and environment?
Another PDF. Python 3.9.1 in conda env.
However, the same command on the same PDF worked with pdfminer.six and I got what I wanted.