pypdf
Javascript code added via addJS gets executed twice
When using addJS with PyPDF2 1.26.0 on Python 3.6.1 to add JavaScript logic to a PDF, the script is executed twice when the PDF is opened instead of once. With the following sample code, the message box is shown twice when the file is opened in Adobe Reader XI on Windows 7, instead of only once as expected:
from PyPDF2 import PdfReader, PdfWriter


def extract(input_pdf, output_pdf, page):
    reader = PdfReader(input_pdf)
    writer = PdfWriter()
    # `page` is 1-based, while reader.pages is 0-indexed.
    pdf_page = reader.pages[page - 1]
    writer.add_page(pdf_page)
    # Script that is expected to run once when the PDF is opened.
    writer.add_js('app.alert("Startup");')
    with open(output_pdf, "wb") as f:
        writer.write(f)
Sample usage:
extract("my_file.pdf", "my_file_withjs.pdf", 1)
Here is the solution to your case. I had the same issue, and after digging into the code of the pdf.py file in PyPDF2, specifically the addJS() method on the PdfWriter class, I found out that it registers the script under /OpenAction as well as under the /JavaScript name.
I have overridden the method, and here is my code:
"""
This is the fix for the global JavaScript being added twice.
Instead of calling the original writer.addJS() method, I overrode
the addJS method.
"""
from PyPDF2 import PdfFileWriter
from PyPDF2.generic import (
    ArrayObject,
    DictionaryObject,
    NameObject,
    createStringObject,
)


def addJS(writer: PdfFileWriter, javascript, jsName):
    """
    Add Javascript which will launch upon opening this PDF.

    :param PdfFileWriter writer: The writer to add the script to.
    :param str javascript: Your Javascript.
    :param str jsName: The name under which the script is stored.

    >>> addJS(output, "this.print({bUI:true,bSilent:false,bShrinkToFit:true});", "print")
    # Example: This will launch the print window when the PDF is opened.
    """
    js = DictionaryObject()
    js.update(
        {
            NameObject("/Type"): NameObject("/Action"),
            NameObject("/S"): NameObject("/JavaScript"),
            NameObject("/JS"): NameObject("(%s)" % javascript),
        }
    )
    js_indirect_object = writer._addObject(js)

    # We need a name for parameterized javascript in the pdf file, but it can be anything.
    """
    Here is what I have changed.
    This change is not really necessary, but instead of having
    a long uuid string it is better to give a meaningful name.
    """
    # js_string_name = str(uuid.uuid4())
    js_string_name = jsName

    js_name_tree = DictionaryObject()
    js_name_tree.update(
        {
            NameObject("/JavaScript"): DictionaryObject(
                {
                    NameObject("/Names"): ArrayObject(
                        [createStringObject(js_string_name), js_indirect_object]
                    )
                }
            )
        }
    )
    writer._addObject(js_name_tree)

    """
    Here is what I have changed.
    I only replaced /OpenAction with /JavaScript.
    This will prevent having 2 global document entries.
    The /OpenAction creates a Doc Open on top of Document-Level, which is unnecessary in our case.
    """
    writer._root_object.update(
        {
            # NameObject("/OpenAction"): js_indirect_object,
            NameObject("/JavaScript"): js_indirect_object,
            NameObject("/Names"): js_name_tree,
        }
    )
    return writer
And here is how you call it from your Python code:
javaScript = "your javascript goes here"
jsName = "the name that you want"
#writer.addJS(javaScript)
addJS(writer, javaScript, jsName)
This will solve your problem; I have tested it and it works. Good luck coding.
Hi @youssef-jaber, is there a way to add more than one JavaScript file?
When I call .addJS() twice, for some reason only the last script is kept. I think that at some point it creates a new object containing only the last JS code.
Hi kenca23, could you please explain in more detail what you are actually trying to do? I am sorry, I could not understand what your question is. Thanks.
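One possible reading of the question above: the override shown earlier assigns a fresh /Names entry to the catalog on every call, so a second call would replace the first script rather than add to it. The sketch below models this with plain Python dicts only; it does not use the PyPDF2 API, and build_js_names_array is a hypothetical helper introduced purely for illustration. It relies on the fact that a PDF name tree stores its entries as one flat array of alternating names and values, sorted by name.

```python
def build_js_names_array(scripts):
    """Flatten {name: javascript} into [name1, action1, name2, action2, ...]."""
    names = []
    for name in sorted(scripts):  # name-tree keys are kept in sorted order
        action = {"/Type": "/Action", "/S": "/JavaScript", "/JS": scripts[name]}
        names.extend([name, action])
    return names


# Overwriting: each iteration replaces /Names, so only the last script survives.
catalog = {}
for name, code in [("first", "app.alert(1);"), ("second", "app.alert(2);")]:
    catalog["/Names"] = {
        "/JavaScript": {"/Names": build_js_names_array({name: code})}
    }
overwritten = catalog["/Names"]["/JavaScript"]["/Names"]

# Accumulating: collect all scripts first, then build a single array.
merged = build_js_names_array(
    {"first": "app.alert(1);", "second": "app.alert(2);"}
)

print(overwritten[0::2])
print(merged[0::2])
```

Under this model, keeping several scripts means putting all of them into the same /Names array instead of creating a new name tree per call.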
JavaScript stored in the /JavaScript entry of /Names under /Catalog is executed automatically. You should use only one of /OpenAction and /Names, not both; using both is why the #482 bug occurs. A /JavaScript key directly under /Catalog is meaningless, so it is completely ignored by PDF viewers. Just removing the line NameObject("/JavaScript"): js_indirect_object, is enough.
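The explanation above can be illustrated with a deliberately simplified model of what a viewer does when a document is opened. This is only a sketch using plain dicts in place of PDF objects; scripts_run_on_open is a hypothetical function, not part of PyPDF2 or any viewer, and real viewers do considerably more.

```python
def scripts_run_on_open(catalog):
    """Return the scripts a simplified viewer would run when opening the document."""
    executed = []

    # Document-level scripts: /Names -> /JavaScript -> /Names is a flat
    # array of [name1, action1, name2, action2, ...].
    js_tree = catalog.get("/Names", {}).get("/JavaScript", {})
    pairs = js_tree.get("/Names", [])
    for action in pairs[1::2]:
        executed.append(action["/JS"])

    # /OpenAction is run as well, if it is a JavaScript action.
    open_action = catalog.get("/OpenAction")
    if isinstance(open_action, dict) and open_action.get("/S") == "/JavaScript":
        executed.append(open_action["/JS"])

    return executed


alert = {"/Type": "/Action", "/S": "/JavaScript", "/JS": 'app.alert("Startup");'}
names = {"/JavaScript": {"/Names": ["startup", alert]}}

both = {"/OpenAction": alert, "/Names": names}       # original addJS behaviour
names_only = {"/Names": names}                       # only the name tree
stray_key = {"/JavaScript": alert, "/Names": names}  # the override shown earlier

print(len(scripts_run_on_open(both)))        # 2
print(len(scripts_run_on_open(names_only)))  # 1
print(len(scripts_run_on_open(stray_key)))   # 1
```

In this model the same alert runs twice only when it is reachable through both /OpenAction and /Names; a /JavaScript key placed directly in the catalog is never consulted, which matches the point that the extra line can simply be dropped.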
Fixed by #1439