nbconvert timing out on relatively simple notebook

Open tfmark opened this issue 3 years ago • 10 comments

Note: This is with nbconvert 6.2.0 from pypi.

I'm currently tumbling down a rabbit hole trying to convert our Jupyter notebooks into PDFs. Our notebooks hit an endpoint to pull down content, which is then visualised via Plotly charts and/or ipywidgets. The issue I'm facing is getting plotly+ipywidgets to display correctly with nbconvert, specifically the WebPDFExporter.

I'm now at the point where I'm simply attempting to render a plotly chart as a png inside an ipywidget Tab, but even that is not going well!

I have a trivial example notebook:

import ipywidgets as widgets
import plotly.graph_objects as go

# to be able to render png/svg
# !/opt/conda/bin/python3 -m pip install -U kaleido

fig = go.Figure()

trace = go.Scatter(
    name="trace0",
    x=[0, 1, 2, 3],
    y=[9, 8, 7, 6]
)

fig.add_trace(trace)

# # if there is no plain fig.show() the conversion execution times out?
# fig.show()

# # the output we want
output = widgets.Output()
with output:
    fig.show(renderer="png")
    
tab = widgets.Tab()
tab.set_title(0, "Tab 0")
tab.children = [
    output
]
tab

The notebook I pass to nbconvert: report2.c590e4290aef40cf9ded2375446d071f.executed.zip

python3 -m nbconvert /home/jovyan/notebooks/work/report2.c590e4290aef40cf9ded2375446d071f.executed.ipynb --to webpdf --disable-chromium-sandbox

[NbConvertApp] Converting notebook /home/jovyan/notebooks/work/report2.c590e4290aef40cf9ded2375446d071f.executed.ipynb to webpdf
[NbConvertApp] Building PDF
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
<snip>
  File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/webpdf.py", line 91, in main
    await page.goto(f'file://{temp_file.name}', waitUntil='networkidle0')
  File "/opt/conda/lib/python3.8/site-packages/pyppeteer/page.py", line 885, in goto
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.

Running WebPDFExporter manually with debug logging shows that there is no networkidle0, just a networkAlmostIdle:

2021-10-12 14:56:08.389 DEBUG protocol.py read_frame client < Frame(fin=True, opcode=<Opcode.TEXT: 1>, data=b'{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"0BEF6AF57086628BB65FE86E69B76182","message":"{\\"method\\":\\"Page.lifecycleEvent\\",\\"params\\":{\\"frameId\\":\\"64511DC4B44A96EDA32CEA1B0248F47C\\",\\"loaderId\\":\\"B74A728769BED0DF5061A79487562A16\\",\\"name\\":\\"firstMeaningfulPaint\\",\\"timestamp\\":27737.527164}}","targetId":"64511DC4B44A96EDA32CEA1B0248F47C"}}', rsv1=False, rsv2=False, rsv3=False)
[D:pyppeteer.connection.Connection] RECV: {"method":"Target.receivedMessageFromTarget","params":{"sessionId":"0BEF6AF57086628BB65FE86E69B76182","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"64511DC4B44A96EDA32CEA1B0248F47C\",\"loaderId\":\"B74A728769BED0DF5061A79487562A16\",\"name\":\"networkAlmostIdle\",\"timestamp\":27736.87627}}","targetId":"64511DC4B44A96EDA32CEA1B0248F47C"}}
[D:pyppeteer.connection.CDPSession] RECV: {"method":"Page.lifecycleEvent","params":{"frameId":"64511DC4B44A96EDA32CEA1B0248F47C","loaderId":"B74A728769BED0DF5061A79487562A16","name":"networkAlmostIdle","timestamp":27736.87627}}
[D:pyppeteer.connection.Connection] RECV: {"method":"Target.receivedMessageFromTarget","params":{"sessionId":"0BEF6AF57086628BB65FE86E69B76182","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"64511DC4B44A96EDA32CEA1B0248F47C\",\"loaderId\":\"B74A728769BED0DF5061A79487562A16\",\"name\":\"firstMeaningfulPaint\",\"timestamp\":27737.527164}}","targetId":"64511DC4B44A96EDA32CEA1B0248F47C"}}
[D:pyppeteer.connection.CDPSession] RECV: {"method":"Page.lifecycleEvent","params":{"frameId":"64511DC4B44A96EDA32CEA1B0248F47C","loaderId":"B74A728769BED0DF5061A79487562A16","name":"firstMeaningfulPaint","timestamp":27737.527164}}
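
A minimal sketch of how a saved debug log like the one above could be checked for the lifecycle events that each `waitUntil` value waits for. The helper name `lifecycle_events` and the `MARKER` constant are hypothetical; the mapping from `waitUntil` values to Chrome lifecycle event names (`networkidle0` waits for `networkIdle`, `networkidle2` for `networkAlmostIdle`) is how I understand pyppeteer to behave, so treat it as an assumption. The sample lines are the two `CDPSession` lines from the log above.

```python
import json

# Assumed mapping from pyppeteer's waitUntil values to Chrome lifecycle event names.
WAIT_UNTIL_TO_EVENT = {
    "load": "load",
    "domcontentloaded": "DOMContentLoaded",
    "networkidle0": "networkIdle",
    "networkidle2": "networkAlmostIdle",
}

# Prefix of the decoded per-session messages in pyppeteer's debug output.
MARKER = "[D:pyppeteer.connection.CDPSession] RECV: "


def lifecycle_events(log_lines):
    """Return the names of all Page.lifecycleEvent messages, in log order."""
    events = []
    for line in log_lines:
        if not line.startswith(MARKER):
            continue
        try:
            message = json.loads(line[len(MARKER):])
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if message.get("method") == "Page.lifecycleEvent":
            events.append(message.get("params", {}).get("name"))
    return events


SAMPLE_LOG = [
    r'[D:pyppeteer.connection.CDPSession] RECV: {"method":"Page.lifecycleEvent","params":{"frameId":"64511DC4B44A96EDA32CEA1B0248F47C","loaderId":"B74A728769BED0DF5061A79487562A16","name":"networkAlmostIdle","timestamp":27736.87627}}',
    r'[D:pyppeteer.connection.CDPSession] RECV: {"method":"Page.lifecycleEvent","params":{"frameId":"64511DC4B44A96EDA32CEA1B0248F47C","loaderId":"B74A728769BED0DF5061A79487562A16","name":"firstMeaningfulPaint","timestamp":27737.527164}}',
]

seen = lifecycle_events(SAMPLE_LOG)
for wait_until, event in WAIT_UNTIL_TO_EVENT.items():
    print(f"{wait_until}: {'seen' if event in seen else 'not seen'}")
```

On these two sample lines only `networkidle2` is reported as seen, which matches the observation that the page never becomes fully idle.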

The reason I'm trying to simply display a PNG is that when I use fig.show() I get complete junk output:

[Image: screenshot of the junk output]

Ultimately I want to be able to reuse the same 'widgets' (plotly charts wrapped up in an ipywidget Tab) for the interactive notebooks as well as PDF reports. It doesn't work out-of-the-box, but I'm struggling to get a simple example working that I can build upon. Any help would be greatly appreciated!

tfmark avatar Oct 12 '21 15:10 tfmark

pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.

You need to increase --ExecutePreprocessor.timeout; see the nbconvert documentation. The default is 30 seconds, and it seems that one of your code cells needs longer to execute. Use -1 to disable the cell run time limit, see here

d-kleine avatar Jun 02 '22 14:06 d-kleine

I thought the timeout is disabled by default? See #791?

mgeier avatar Jun 02 '22 20:06 mgeier

Well, the docs linked above say something different:

The timeout traitlet defines the maximum time (in seconds) each notebook cell is allowed to run, if the execution takes longer an exception will be raised. The default is 30 s, so in cases of long-running cells you may want to specify an higher value. The timeout option can also be set to None or -1 to remove any restriction on execution time.
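
A minimal sketch of passing that traitlet on the command line, assuming the notebook is executed during conversion (`--execute`), since the limit only applies to cell execution. The helper names `nbconvert_command` and `convert` and the file name `report.ipynb` are hypothetical.

```python
import subprocess


def nbconvert_command(notebook, timeout_s=-1, to="webpdf"):
    """Build an nbconvert command line that executes the notebook before export.

    timeout_s is the per-cell limit in seconds; -1 removes the limit.
    """
    return [
        "jupyter", "nbconvert",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout_s}",
        "--to", to,
        notebook,
    ]


def convert(notebook, **kwargs):
    """Run the conversion and raise CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_command(notebook, **kwargs), check=True)
```

For example, `convert("report.ipynb", timeout_s=600)` would allow each cell up to ten minutes.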

d-kleine avatar Jun 02 '22 21:06 d-kleine

It's not a long running cell - the cell is very simple - I've provided an example that exhibits this behaviour in the OP.

tfmark avatar Jun 06 '22 10:06 tfmark

I have tried to run your code, but it seems to run in an infinite loop at

with output:
    fig.show(renderer="png")

That would explain why the notebook conversion is timing out for your code. I ran the code you provided above only up to fig.show(), and voilà, the file is processed: report2.c590e4290aef40cf9ded2375446d071f.executed.pdf

I have used this command: jupyter nbconvert report2.c590e4290aef40cf9ded2375446d071f.executed.ipynb --to webpdf --disable-chromium-sandbox --allow-chromium-download, see here

I am not well versed in plotly, but what puzzles me is that you are trying to show the output as a png. png is an image format, so it should not have any effect on how the image is displayed. Usually you specify a format when writing output to a file, and there is no file format involved when using widgets.Output(). Here you can see that the file is written:

fig.to_image(format="png", engine="kaleido")

So I think the issue here is not nbconvert.

Kind regards, DK

d-kleine avatar Jun 07 '22 06:06 d-kleine

Thanks for looking into this @d-kleine -- the reason that I wanted plotly to output a png is because their default output (js+svg) is not handled by nbconvert (as mentioned in the OP, also see https://github.com/jupyter/nbconvert/issues/1657) and I was trying to see if I could get any output.

One of my colleagues managed to work around the issue (1657) by using beautifulsoup to manipulate the generated html so that the <script> was pulled out into a different cell... but it was very much a hack, not a fix.

tfmark avatar Jun 07 '22 07:06 tfmark

Have you seen this post on StackOverflow?

I was able to save your code and its output in a html file with this command: jupyter nbconvert --execute --to html nbconvert.ipynb

with the following change in the code (idk if this is crucial, but it worked for me):

fig.show(renderer="notebook")

see here

d-kleine avatar Jun 07 '22 11:06 d-kleine

As mentioned, the high-level problem was converting a Jupyter notebook to PDF so that what is displayed when running the report manually in the browser is exactly what can be generated programmatically via nbconvert. This problem was solved by hacking the generated html.

As part of trying to get nbconvert to render a plotly chart inside an ipywidgets tab, I encountered this timeout, hence this issue. You were able to see it as well ("but it seems to run in an infinite loop at..."). Is that a bug, or is that the expected behaviour of nbconvert?

Ultimately our business-level problem (convert a jupyter notebook into a pdf) was solved via hacks, but I feel that #1657 ought to "Just Work", and I don't have much of an opinion of this particular Issue (#1656) so if you want to close it, go for it.

tfmark avatar Jun 07 '22 11:06 tfmark

As far as I could see from my research, the timeout comes from the pyppeteer module, see here:

timeout (int|float): maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.

I have found this as well.
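
A minimal sketch of rendering an already exported HTML file to PDF directly with pyppeteer, so that the navigation timeout and the `waitUntil` condition can be varied outside nbconvert. The helper names `goto_options` and `html_to_pdf` and the file names are hypothetical; `timeout` and `waitUntil` are the `page.goto` options quoted above and shown in the traceback. This is not how nbconvert itself is configured, only a way to experiment.

```python
from pathlib import Path

# waitUntil values accepted by pyppeteer's page.goto.
VALID_WAIT_UNTIL = {"load", "domcontentloaded", "networkidle0", "networkidle2"}


def goto_options(timeout_ms=30000, wait_until="networkidle0"):
    """Build the options dict for page.goto; a timeout of 0 disables the limit."""
    if wait_until not in VALID_WAIT_UNTIL:
        raise ValueError(f"unsupported waitUntil value: {wait_until!r}")
    if timeout_ms < 0:
        raise ValueError("timeout_ms must be >= 0")
    return {"timeout": timeout_ms, "waitUntil": wait_until}


async def html_to_pdf(html_path, pdf_path, **kwargs):
    """Load a local HTML file in headless Chromium and print it to PDF."""
    from pyppeteer import launch  # imported here so the helper above works without pyppeteer

    browser = await launch(args=["--no-sandbox"])
    try:
        page = await browser.newPage()
        await page.goto(Path(html_path).resolve().as_uri(), goto_options(**kwargs))
        await page.pdf({"path": str(pdf_path), "printBackground": True})
    finally:
        await browser.close()


# Usage (requires pyppeteer and a Chromium download):
# import asyncio
# asyncio.run(html_to_pdf("report.html", "report.pdf", timeout_ms=0, wait_until="networkidle2"))
```

Waiting for `networkidle2` instead of `networkidle0` tolerates up to two open connections, which may be enough for a page that never goes fully idle.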

d-kleine avatar Jun 07 '22 21:06 d-kleine

Per the OP, I was testing with 6.2.0 which included the change to use a temp file (6.1)... I guess the question is (again, from the OP), why is the network never detected as being fully idle...

tfmark avatar Jun 08 '22 06:06 tfmark