vscode-jupyter
Race condition in port forwarding on Remote-Container on WSL
Applies To
- [X] Notebooks (.ipynb files)
- [ ] Interactive Window and/or Cell Scripts (.py files with #%% markers)
What happened?
I'm using Bokeh and Holoviews to render a chart that opens a connection back to a websocket.
A race condition exists between the backend Tornado server starting up, the port being forwarded, and the first HTTP request being made. Re-running the cell repeatedly results in either a chart being displayed or no chart being displayed.
VS Code Version
1.71.1
Jupyter Extension Version
v2022.8.1002431955
Jupyter logs
Rather than pasting the Jupyter logs, here are the Developer Console JavaScript logs:
Startup
INFO [attempt 1] Invoking resolveAuthority(dev-container)
log.ts:301 INFO [attempt 1] resolveAuthority(dev-container) returned '127.0.0.1:35017' after 4479 ms
VM11:4 Registering custom require.js for Jupyter Kernel
eval @ VM11:4
notebookWebviewPreloads.js:3 Notebook preload (https://vscode-remote%2Bdev-002dcontainer-002b633a5c7372635c6a6f686e6e792d726f626f745c4572676f6e2e52657365617263685c656d725c646174615c6175676d656e745c4e42424f.vscode-resource.vscode-cdn.net/home/vscode/.vscode-server/extensions/ms-toolsai.jupyter-2022.8.1002431955/out/node_modules/%40vscode/jupyter-ipywidgets/dist/ipywidgets.js) looks like a module but does not export an activate function
r @ notebookWebviewPreloads.js:3
console.ts:137 [Extension Host] Starting WebSocket: RAW/api/kernels/f1213451-2417-49ab-89f2-71c059f9c919
DevTools failed to load source map: Could not load content for https://vscode-remote+dev-002dcontainer-002b633a5c7372635c6a6f686e6e792d726f626f745c4572676f6e2e52657365617263685c656d725c646174615c6175676d656e745c4e42424f.vscode-resource.vscode-cdn.net/home/vscode/.vscode-server/extensions/ms-toolsai.jupyter-2022.8.1002431955/out/webviews/webview-side/ipywidgetsKernel/ipywidgetsKernel.js.map: Connection error: net::ERR_NAME_NOT_RESOLVED
No chart displayed
VM22:343 [bokeh] setting log level to: 'info'
The FetchEvent for "http://localhost:36281/autoload.js?bokeh-autoload-element=1002&bokeh-absolute-url=http://localhost:36281&resources=none" resulted in a network error response: the promise was rejected.
Promise.then (async)
(anonymous) @ service-worker.js:213
service-worker.js:352 Uncaught (in promise) TypeError: Failed to fetch
at l (service-worker.js:352:11)
l @ service-worker.js:352
VM26:12 GET http://localhost:36281/autoload.js?bokeh-autoload-element=1002&bokeh-absolute-url=http://localhost:36281&resources=none net::ERR_FAILED
(anonymous) @ VM26:12
(anonymous) @ VM26:13
domEval @ index.js:1304
renderHTML @ index.js:1317
renderOutputItem @ index.js:1447
render @ notebookWebviewPreloads.js:3
renderOutputCell @ notebookWebviewPreloads.js:3
await in renderOutputCell (async)
(anonymous) @ notebookWebviewPreloads.js:3
te.outputs.set.queue @ notebookWebviewPreloads.js:3
enqueue @ notebookWebviewPreloads.js:3
(anonymous) @ notebookWebviewPreloads.js:3
postMessage (async)
(anonymous) @ index.html?id=d051151b-29f8-40da-ba40-2da365b2934e&origin=d051151b-29f8-40da-ba40-2da365b2934e&swVersion=4&extensionId=&platform=electron&vscode-resource-base-authority=vscode-resource.vscode-cdn.net&parentOrigin=vscode-file%3A%2F%2Fvscode-app&remoteAuthority=dev-container%2B633a5c7372635c6a6f686e6e792d726f626f745c4572676f6e2e52657365617263685c656d725c646174615c6175676d656e745c4e42424f&purpose=notebookRenderer:1102
HostMessaging.channel.port1.onmessage @ index.html?id=d051151b-29f8-40da-ba40-2da365b2934e&origin=d051151b-29f8-40da-ba40-2da365b2934e&swVersion=4&extensionId=&platform=electron&vscode-resource-base-authority=vscode-resource.vscode-cdn.net&parentOrigin=vscode-file%3A%2F%2Fvscode-app&remoteAuthority=dev-container%2B633a5c7372635c6a6f686e6e792d726f626f745c4572676f6e2e52657365617263685c656d725c646174615c6175676d656e745c4e42424f&purpose=notebookRenderer:295
Chart displays and is interactive
VM22:343 [bokeh] setting log level to: 'info'
VM22:746 [bokeh] Websocket connection 0 is now open
VM22:324 [bokeh] document idle at 69 ms
VM22:322 Bokeh items were rendered successfully
Coding Language and Runtime Version
Python v3.9.13, holoviews 1.15.0, hvplot 0.8.1, bokeh 2.4.3
Language Extension Version (if applicable)
No response
Anaconda Version (if applicable)
No response
Running Jupyter locally or remotely?
Remote
I'm using the simple repro code from https://github.com/microsoft/vscode-jupyter/issues/1714 to test:
import xarray as xr
import hvplot.xarray
import numpy as np
arr = xr.DataArray(
    np.random.random((2, 3, 4)),
    dims=['x', 'y', 'time'],
    coords={'x': np.arange(2), 'y': np.arange(3), 'time': np.arange(4)}
)
test = arr.hvplot(x='x', y='y')
Re-run this until it works
import holoviews as hv
renderer = hv.renderer('bokeh')
renderer.app(test, show=True)
Note I also had to set the environment variable BOKEH_ALLOW_WS_ORIGIN=* to avoid https://github.com/bokeh/bokeh/issues/10765 and https://github.com/microsoft/vscode-jupyter/issues/4132
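For anyone reproducing this, one way to set that variable is from the notebook itself, in a cell that runs before the Bokeh server is created. This is only a sketch: it assumes the variable is read when the server starts, and the wildcard value accepts websocket connections from any origin, so it is suited to local testing only.

```python
import os

# Assumption: this cell runs before renderer.app(...) starts the Bokeh server.
# "*" allows websocket connections from any origin (permissive, for local testing).
os.environ["BOKEH_ALLOW_WS_ORIGIN"] = "*"
```

Setting it in the dev container's environment instead would avoid depending on cell execution order.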
Thanks for filing this issue, I'll discuss this with the team and look into this.
At first when I began testing today I wasn't able to reproduce the issue. I was able to re-run the renderer.app cell repeatedly and always got a chart and no JavaScript errors.
Then I switched to the Jupyter Output Window (I had been in a Terminal Window), and the error now occurs every time I run the cell: no chart, and net::ERR_FAILED errors.
I tried to replicate that behavior by starting VS Code with the Terminal visible and running the cells described above, but that did not yield what I'd hoped. At this time nothing I do results in a chart being displayed; I receive a net::ERR_FAILED every time I run the second cell.
A race condition exists between the backend Tornado server starting up, the port being forwarded, and the first HTTP request being made. Re-running the cell repeatedly results in either a chart being displayed or no chart being displayed.
Do I understand correctly -
- A Tornado server is set up somewhere on the remote
- Automatic port forwarding kicks in to forward that port
- The forwarded port is accessed from the notebook renderer
But the race is in forwarding that port before the renderer needs it? Is that what you think is happening @DonJayamanne?
If you can set a static port, you could probably forward it with forwardPorts in your devcontainer.json and that could help with the race.
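A rough sketch of the static-port idea follows. The port number 5006 and both helper names are hypothetical, chosen only for illustration. The helpers check that the chosen port is free inside the container and print the matching forwardPorts fragment for devcontainer.json.

```python
import json
import socket

STATIC_PORT = 5006  # hypothetical choice; any free port would do


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def devcontainer_fragment(port):
    """Build the devcontainer.json fragment that forwards a fixed port."""
    return json.dumps({"forwardPorts": [port]}, indent=2)


if __name__ == "__main__":
    print("port free:", port_is_free(STATIC_PORT))
    print(devcontainer_fragment(STATIC_PORT))
```

The server would then need to be started on the same port. Assuming HoloViews' renderer.app accepts a port argument, that would look like renderer.app(test, show=True, port=STATIC_PORT).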
I'll test with a static port.
Notably I'm not getting a connection timeout error as that port comes online. I'm getting this net::ERR_FAILED in the service worker. I don't understand how the port-forwarding magic works, so maybe that's the error that would be expected if no port was open, but for an HTTP connection I'd usually expect some timeout that would give that port a few hundred milliseconds to arrive.
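The grace period described here can be sketched as a small retry loop. This is an illustration only, not what the VS Code service worker does: the function name and default timings are made up, and it can only confirm that a port accepts connections from wherever the code runs (inside the container, if run from the kernel), not that the port has been forwarded to the client.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=2.0, interval=0.05):
    """Retry a TCP connection until it succeeds or `timeout` seconds pass.

    Returns True once the port accepts a connection, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would invoke wait_for_port(port) after starting the server and before triggering the first request.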
Today I'm not able to replicate the race condition, so I can't tell if the static port helps... It's not a great long-term solution though, because there may be many apps created in a notebook, and managing the port numbers, shutting down the existing applications, etc. would really limit the notebook experience.
The requirement to create an app and bind it to a port is itself a workaround for the inability of Bokeh/Holoviews to communicate back through the IPython Proxy or whatever. The following should "just work", and does on a Jupyter instance running directly in the VS Code Terminal with no additional ports forwarded:
This renders but the interactivity doesn't work in VS Code.
Here's the channel that was created on Jupyter Lab:
It seems like there are a few outstanding issues logged for Bokeh and VS Code. I can create a new one for the above behavior if that would be helpful.
But the race is in forwarding that port before the renderer needs it? Is that what you think is happening @DonJayamanne?
Yes.
The following should "just work", and does on a Jupyter instance running directly in the VS Code Terminal with no additional ports forwarded:
@ddrinka I'm assuming you are still running all of this in WSL. Is that right?
Today I'm not able to replicate the race condition, so I can't tell if the static port helps... It's not a great long term solution
Understood.
@alexr00 Any idea what we can do here? Basically there seems to be a race condition: the port forwarding seems to happen after the webview attempts to access the port.
If a webview needs the port then the owner of the webview should call asExternalUri to cause it to be forwarded. Automatic port forwarding is a user-facing feature and works by polling, with a potentially non-constant polling frequency depending on the speed of the machine. Because of this, it's not safe to rely on automatic port forwarding for programmatic access to forwarded ports.
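To see why polling makes the outcome vary from run to run, here is a toy timing model. It is not VS Code's implementation: the function names, the fixed interval, and the millisecond values are all invented for illustration. It assumes the port becomes reachable at the first poll tick at or after the server starts listening.

```python
import math


def forward_ready_at(server_start, poll_interval, poll_phase=0):
    """First poll tick (poll_phase + k * poll_interval) at or after server_start."""
    ticks = math.ceil((server_start - poll_phase) / poll_interval)
    return poll_phase + max(ticks, 0) * poll_interval


def request_succeeds(server_start, request_at, poll_interval, poll_phase=0):
    """A request succeeds only if the port was already forwarded when it was made."""
    return request_at >= forward_ready_at(server_start, poll_interval, poll_phase)


if __name__ == "__main__":
    # Times in milliseconds: server up at 1000, renderer requests at 1050.
    print(request_succeeds(1000, 1050, poll_interval=2000, poll_phase=0))
    print(request_succeeds(1000, 1050, poll_interval=2000, poll_phase=1000))
```

In this model the same server and request timing fails or succeeds depending only on where the poll tick happens to fall, which matches the intermittent behavior reported in this issue.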
@DonJayamanne yes, this is all WSL.
@alexr00 understood. That polling interval sounds exactly like the cause of the trouble.
I'm out of my depth with all these internals but I'm happy to open an issue with Bokeh regarding their webview utilization.
While the experts are looking at this, and if you'll forgive my hijacking, can you provide any advice for Bokeh to solve the proxy communication issue that leads to this requirement for opening additional ports in the first place?
https://github.com/bokeh/bokeh/issues/10765 https://github.com/microsoft/vscode-jupyter/issues/4132
It sounds like work was done for IPyWidgets that would have to be done for Bokeh as well to make it work? https://github.com/microsoft/vscode-jupyter/wiki/Component:-IPyWidgets
With IPyWidgets, there are no custom ports open; IPyWidgets communicate over the Jupyter protocol. I'll dig through the Bokeh code and get in touch with their maintainers. My preliminary suggestion would be to use the Jupyter protocol, as that's more resilient; otherwise anyone dealing with Jupyter remotely could have similar issues with firewall restrictions and the like.
By default I believe Bokeh does use the Jupyter protocol in a way similar to IPyWidgets. This ticket exists due to my attempt to work around the inability of Bokeh to succeed in its normal communication style by forcing open a new Tornado backend server and using that instead. Which almost works. :p
This ticket exists due to my attempt to work around the inability of Bokeh to succeed in its normal communication style by forcing
Could you provide more information about this? Is there an issue on the Bokeh repo for this (the problem you're trying to work around)? I ask because fixing that root problem would then alleviate this issue.
Those tickets I linked above,
https://github.com/bokeh/bokeh/issues/10765 https://github.com/microsoft/vscode-jupyter/issues/4132
seem to track the base issue.
My preliminary suggestion would be to use the Jupyter protocol, as that's more resilient; otherwise anyone dealing with Jupyter remotely could have similar issues with firewall restrictions and the like.
Just for some context, Bokeh (even the current Bokeh server, which came later) well predates JupyterLab and IIRC even predates notebook comms. But more importantly, most usage of Bokeh server is simply not in notebook/jupyter environments at all. So having Jupyter comms be the only, or even the default, transport, is a non-starter. I do very much think that liberating the Bokeh protocol from a particular transport, so that bi-directional Bokeh eventing could easily happen over websocket, or jupyter comms, or whatever else anyone might like, would be fantastic to have happen. But Bokeh is no longer something I get paid to work on, and this is not really an in-your-free-time chunk of work. I am not sure when it might happen without a proper dedication of resources.
I've done enough issue-hijacking. I'll jump over to Bokeh's issue tracking for further conversation about Bokeh and VSCode using the default transport, and leave this issue to track any additional thoughts on the race condition when Bokeh is running in Server mode.
Thanks for your input @bryevdv, appreciate the response here.
Summary (internal, personal notes):
- Not much can be done from VS Code to resolve this issue
- If Bokeh were to use IPyWidgets and kernel Comms messages instead of a custom websocket connection, that would address this issue.
- See here for context of the current approach https://github.com/microsoft/vscode-jupyter/issues/11368#issuecomment-1253222327
I'm going to close this as something that cannot be fixed in VS Code, as this is specific to the bokeh package. However, if there's something we (Jupyter extension or VS Code) can do to resolve this, please do comment here or create a new issue.
Closing this for now.