vscode-jupyter Slow execution in remote ssh

Originally reported in https://twitter.com/bradneuberg/status/1661823674612330496?s=20 : printing a data frame that isn't particularly large took 20+ seconds in VS Code, while in Jupyter Notebook it's instant.

May 26 '23 18:05 rebornix

Since the timer stays on “pending” the whole time, we never heard from the EH (extension host) process that it started executing the code, so it must be a very slow connection or a busy process. Trace-level logs from the Window and Jupyter output channels would help me understand what's going on.

May 26 '23 22:05 roblourens

@roblourens one change we made in Jupyter Pre-release is the fact that the timer will now start only after we get a response back from the kernel (i.e. it will still be in pending state even after jupyter extension handles the execution). This was to avoid showing wrong execution times. Not sure what version of Jupyter ext is being used here (stable vs insiders). Yes logs will definitely help here
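To illustrate the timer behavior described above, here is a toy model. It is only a sketch: `CellTimer` and its methods are hypothetical names made up for this example, not the Jupyter extension's actual code. The point is that the cell stays "pending" until the first kernel response arrives, and the reported duration excludes the time spent waiting for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CellTimer:
    """Toy model of a cell's execution timer (all names are hypothetical)."""

    queued_at: float
    started_at: Optional[float] = None  # set when the kernel first responds
    ended_at: Optional[float] = None

    def on_kernel_response(self, now: float) -> None:
        # The timer starts here, not when the extension handled the request.
        if self.started_at is None:
            self.started_at = now

    def on_finished(self, now: float) -> None:
        self.ended_at = now

    @property
    def state(self) -> str:
        if self.started_at is None:
            return "pending"
        return "running" if self.ended_at is None else "done"

    @property
    def reported_duration(self) -> Optional[float]:
        # Only defined once the cell has both started and finished.
        if self.started_at is None or self.ended_at is None:
            return None
        return self.ended_at - self.started_at


timer = CellTimer(queued_at=0.0)
pending_state = timer.state           # no kernel response yet
timer.on_kernel_response(now=20.0)    # e.g. a slow link delays the first response
timer.on_finished(now=20.5)
print(pending_state, timer.state, timer.reported_duration)
```

In this model a 20 second wait on the network shows up as a long "pending" phase rather than as execution time.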

May 26 '23 23:05 DonJayamanne

We have seen a few of these in the past, and that's why we added some logging in the Jupyter extension to see when Jupyter got the execution callback from the VS Code Notebook API. In the past the cause was either the network or something else (in a recent case it had to do with a network drive or the like).

May 26 '23 23:05 DonJayamanne

I can't comment on the DataFrame example from the tweet, but I've noticed similar, extremely sluggish behavior that I can reliably reproduce [VSCode 1.78.2 client on windows, server on Linux, Jupyter extension v2023.4.1011241018] by generating a couple of plots in pyplot in a slow network environment (say, limit up- and downstream to 1024kbps for a formidable demonstration). Somehow, execution of the first cell below is triggering unreasonably large outbound traffic (~10MB). Rendering of the plots seems to run independently, with inbound traffic of ~1MB until rendering is complete. In a network with slow uplink, the execution of the second cell would then noticeably stall until the outbound transfer triggered by the first cell is complete.

# notebook cell 1
import numpy as np
import matplotlib.pyplot as plt

for _ in range(25):
    plt.figure()
    plt.scatter(np.random.normal(size=1000), np.random.normal(size=1000))
    plt.show()

# notebook cell 2
1+1

One more observation: If I draw the plots with a single datapoint plt.scatter(np.random.normal(size=1), np.random.normal(size=1)) the outbound traffic is significantly less, but still much greater compared to when I draw the single datapoint without numpy arrays, plt.scatter([0], [0]).
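For a sense of scale, a back-of-the-envelope estimate of how long those transfers take on the throttled link. This is a rough sketch that assumes 1 kbps = 1000 bit/s and 1 MB = 1,000,000 bytes, and ignores protocol overhead and compression; `transfer_seconds` is just a helper defined for this example.

```python
def transfer_seconds(payload_bytes: float, link_kbps: float) -> float:
    """Seconds to push payload_bytes over a link of link_kbps (1 kbps = 1000 bit/s)."""
    return payload_bytes * 8 / (link_kbps * 1000)


MB = 1_000_000  # assumption: decimal megabytes

outbound = transfer_seconds(10 * MB, 1024)  # the ~10MB outbound traffic
inbound = transfer_seconds(1 * MB, 1024)    # the ~1MB inbound traffic
print(f"outbound ~{outbound:.0f} s, inbound ~{inbound:.0f} s")
# prints "outbound ~78 s, inbound ~8 s"
```

So on a 1024kbps uplink the outbound transfer alone would account for more than a minute of stalling.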

Hope this helps!

May 27 '23 00:05 fbuessen

Since it's triggered by a bunch of outputs, it's probably the issue we discussed in https://github.com/microsoft/vscode/issues/172345 @rebornix

Jun 02 '23 19:06 roblourens

@fbuessen We have added experimental saving logic for Remote SSH. It would be great if you could give it a try and see if it improves the performance:

1. Install the latest VS Code Insiders
2. Remote SSH into your remote machine
3. Add "notebook.experimental.remoteSave": true to your Remote/User settings
4. Reload the window
5. Then test the scenarios which used to be slow or block the network
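If you prefer to script the settings change, a minimal sketch follows. It assumes the settings file is plain JSON without comments or trailing commas (VS Code tolerates those, Python's `json` module does not); `enable_remote_save` is a hypothetical helper, and where your settings.json lives depends on your setup.

```python
import json

SETTING = "notebook.experimental.remoteSave"  # setting name from the steps above


def enable_remote_save(settings_text: str) -> str:
    """Return settings JSON text with the experimental flag set to true."""
    # Assumption: the input is strict JSON (no comments, no trailing commas).
    settings = json.loads(settings_text) if settings_text.strip() else {}
    settings[SETTING] = True
    return json.dumps(settings, indent=4)


print(enable_remote_save('{"editor.fontSize": 14}'))
```

Existing keys are kept as they are; only the one flag is added or overwritten.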

Thanks in advance!

Jun 29 '23 21:06 rebornix

Hi @rebornix, thanks for looking into this. I followed your instructions to run the most recent VS Code 1.80.0-insider [commit c1bca6d7] on a Windows client, connected to a Linux host through Remote-SSH extension v0.103.2023062115 with Jupyter v2023.6.1001821100. I've added "notebook.experimental.remoteSave": true to both my remote and user level settings.

The settings editor annotates the setting as "Unknown Configuration Setting" -- is this expected or am I not running the correct version?

The behavior when I run my example notebook above is unchanged. When executing the cell

# notebook cell 1
import numpy as np
import matplotlib.pyplot as plt

for _ in range(25):
    plt.figure()
    plt.scatter(np.random.normal(size=1000), np.random.normal(size=1000))
    plt.show()

the output is rendered quickly, but any successive commands (execute next cell or saving the notebook) will stall until the relatively large outbound traffic of ~10MB is complete.

Jun 30 '23 00:06 fbuessen

@fbuessen thanks for your quick response. Yes, you are on the right version, and the warning is fine; we can ignore it. I tried your code snippet and I can reproduce the big payload to the extension host. My hypothesis is that we are somehow not updating the image outputs incrementally. To help validate this, can you confirm that once you have the document saved, if you create a new cell, run a simple print(1), and then save, there is no large outbound traffic?

Jun 30 '23 00:06 rebornix

@rebornix I tried the following execution flow:

1. Run the first cell to generate the plots [this generates large outbound traffic, and subsequent commands stall until it's complete]
2. Save notebook [this does NOT generate significant traffic]
3. Run a new cell print(1) [this does NOT generate significant traffic]
4. Save notebook [this does NOT generate significant traffic]

Jun 30 '23 00:06 fbuessen

@fbuessen thanks, that's good to hear. Without the notebook.experimental.remoteSave setting, both steps 3 and 4 would generate significant traffic. Thinking about it again, step 1 generates large traffic because:

1. The kernel sent MBs of images from the remote extension host to the renderer process (the desktop client side)
2. The renderer process then broadcasts the output changes, which are the new images, to the extension host

The renderer is the source of truth, so it will always try to broadcast what's being changed to all available extension hosts (in the Remote scenario, there is a local extension host and a remote extension host). To improve this case, we could consider holding a reference to the image data when kernels send it from the extension host to the renderer process; that way, when we broadcast the change back to the extension host, we don't necessarily have to send the images again.

It might require careful design: VS Code supports multiple extension hosts, so we need to ensure that extension hosts that don't have the image data still get it in the end. Will look into this in July.
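A rough sketch of the idea, to make the trade-off concrete. This is not VS Code's implementation: `ExtensionHost`, `Renderer` and the hash-keyed bookkeeping are all hypothetical, chosen only to show a host that already holds the image data receiving a reference while every other host still receives the full bytes.

```python
import hashlib
from typing import Dict, List, Optional, Set


class ExtensionHost:
    """Hypothetical stand-in for an extension host that keeps output blobs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blobs: Dict[str, bytes] = {}
        self.bytes_received = 0

    def receive(self, digest: str, data: Optional[bytes]) -> None:
        if data is not None:
            # Full payload: store it and count the transferred bytes.
            self.blobs[digest] = data
            self.bytes_received += len(data)
        elif digest not in self.blobs:
            raise KeyError(f"{self.name} has no blob {digest}")


class Renderer:
    """Hypothetical source of truth that broadcasts output changes."""

    def __init__(self, hosts: List[ExtensionHost]) -> None:
        self.hosts = hosts
        self.holders: Dict[str, Set[str]] = {}  # digest -> hosts holding the blob

    def output_from(self, origin: ExtensionHost, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        origin.blobs[digest] = data  # the origin keeps a reference to what it sent
        self.holders.setdefault(digest, set()).add(origin.name)
        self.broadcast(digest, data)

    def broadcast(self, digest: str, data: bytes) -> None:
        holders = self.holders[digest]
        for host in self.hosts:
            if host.name in holders:
                host.receive(digest, None)   # reference only, no image bytes
            else:
                host.receive(digest, data)   # this host still needs the bytes
                holders.add(host.name)


remote = ExtensionHost("remote")
local = ExtensionHost("local")
renderer = Renderer([remote, local])
image = b"\x89PNG" + bytes(1_000_000)  # fake ~1MB image payload
renderer.output_from(remote, image)
print(remote.bytes_received, local.bytes_received)
```

In this toy setup the remote host, which produced the image, receives no image bytes back, while the local host still gets the full payload once.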

Thanks again for helping me verify this!

Jun 30 '23 00:06 rebornix

Same issue here. Long pending times happen both over a VS Code SSH connection and with VS Code in a Windows Remote Desktop session, while Jupyter Notebook in the browser has no issue.

Jul 04 '23 05:07 cqlc94

Same issue here, also when using Interactive window in VS Code cell-mode execution.

@rebornix Why is the renderer the source of truth? If there are data received from the server and the source of truth needs to be local - the client receiving data from the server should be the source of truth, not the renderer. I'm asking as a programmer with no inside knowledge whatsoever of VS Code

Aug 21 '23 14:08 kwikwag

I've also encountered the same issue. It even happens when I'm just importing libraries in the first cell, without any significant image rendering. I'm looking forward to a good solution.

Aug 27 '23 11:08 aqlkzf

@aqlkzf importing torch or similar packages is generally slow the first time. From my experience this is slow outside VS Code as well, i.e. it's not specific to remote notebooks.

Aug 28 '23 00:08 DonJayamanne

Thank you for your response. However, I'm still encountering issues when using Jupyter in VSCode via a remote SSH connection. My primary interest is in leveraging the VSCode plugin ecosystem, particularly Copilot. Yet, the slow import process is a significant hindrance.

Jan 13 '24 06:01 aqlkzf