jupyterlab-lsp
Slow startup on a cloud-hosted Hub
Description
We (well, @yuvipanda and team did the hard work) installed the LSP tools on a cloud-hosted hub I use to teach a ~1,200-student course this term. Unfortunately, Yuvi and team found performance issues; this commit describes slow startup times.
In regular usage, I also noticed tab completion being slower than usual, to the point where it interferes with the typing workflow (to actually save typing, tab completion really needs to keep up with ~70 wpm typing speeds, which means its latency budget is very, very tight).
Eventually we disabled it, since for us smooth performance of the hub is critical. Yuvi may have more details.
I'm sorry not to have more specifics, but at least flagging it here for the team. In the long run I hope we'll be able to deploy this in production everywhere, as I'd love to teach students with these enhanced capabilities.
Thanks for the great work, btw!
cc @ellisonbg
Thanks for opening this. Do you have any reason to believe that the performance issues are related to the number of users in the deployment, or do you think that detail is unrelated to the root cause?
This previous issue seems to be describing the same problem: https://github.com/jupyter-lsp/jupyterlab-lsp/issues/796.
I think that the commit description is right: we are checking for installed language servers, which takes time. It might take even longer on Windows due to the expansion of the search path introduced in https://github.com/jupyter-lsp/jupyterlab-lsp/pull/731 (not yet released).
@yuvipanda, did you try disabling auto-detection of language servers with the `autodetect` traitlet (docs) and specifying configuration for only the language servers relevant to your course (I guess a Python LSP server + maybe Markdown, but likely not all three Python language servers and a dozen others) via config files? The logic is in:
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L122-L139
You might want to open a PR changing the `autodetect` traitlet to `Union([Bool, Set])`; if a set is provided, we would only attempt to autodetect the servers listed in it, which would save you from adding custom configuration for those servers.
The autodetection logic is in: https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L228-L288
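As a rough illustration of that suggested change, here is a minimal sketch (not the actual jupyter_lsp code; the class and method names are invented) of an `autodetect` traitlet that accepts either a bool or a set of server names:

```python
# Hypothetical sketch of the suggested Union([Bool, Set]) autodetect traitlet.
# Class and method names are assumptions, not the real jupyter_lsp API.
from traitlets import Bool, Set, Unicode, Union
from traitlets.config import Configurable


class LanguageServerManagerSketch(Configurable):
    autodetect = Union(
        [Bool(), Set(Unicode())],
        default_value=True,
        help="True/False, or a set of language server names to probe",
    ).tag(config=True)

    def servers_to_probe(self, all_specs):
        """Filter candidate specs down to the ones we should autodetect."""
        if self.autodetect is True:
            return dict(all_specs)
        if self.autodetect is False:
            return {}
        # a set was provided: only probe the explicitly listed servers
        return {k: v for k, v in all_specs.items() if k in self.autodetect}
```

A deployment could then set e.g. `autodetect = {"pylsp"}` in its config file to skip probing every other server.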
While profiling would be needed, my educated guess is that:
- a few percent is lost on entry points
- much more is lost on server schema validation; good caching would solve this, though it may not help if you are booting from a fresh image. Maybe we could allow disabling the validation, but that should be flagged as dangerous, since it can lead to runtime errors if your specification gets modified later on.
- as much or more time is spent invoking the `is_installed` check; edit: one could review the `is_installed` scripts to see if we can accelerate those.
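On that last point, one hypothetical way to accelerate the `is_installed` checks would be to run them concurrently rather than one after another. A minimal sketch, assuming the checks boil down to PATH lookups (the real checks may do more work):

```python
# Sketch (assumption): probe all server binaries in parallel instead of
# sequentially, so total wait is bounded by the slowest single check.
import shutil
from concurrent.futures import ThreadPoolExecutor


def is_installed(command: str) -> bool:
    # a typical check just looks the binary up on PATH
    return shutil.which(command) is not None


def probe_servers(commands):
    """Return {command: installed?} with all probes running in parallel."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = pool.map(is_installed, commands)
    return dict(zip(commands, results))
```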
Further, reworking these two lines could reduce the startup time penalty by up to 50% (probably less):
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L119-L120
We could first find all servers and then filter that list down to the installed ones. Pull requests welcome :)
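The find-then-filter idea could look roughly like this (a sketch of the shape, not the actual manager code; function names are invented):

```python
# Sketch (assumption about the internals): gather every candidate spec first
# (cheap), then run the potentially slow installation checks as a separate
# filtering pass. Splitting the phases makes the checks easy to parallelise
# or skip entirely.
def collect_specs(spec_finders):
    """Phase 1: merge every candidate spec; no subprocess calls here."""
    specs = {}
    for finder in spec_finders:
        specs.update(finder())
    return specs


def filter_installed(specs, is_installed):
    """Phase 2: keep only specs whose server is actually present."""
    return {key: spec for key, spec in specs.items() if is_installed(spec)}
```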
I answered on the completer topic in: https://github.com/jupyter-lsp/jupyterlab-lsp/issues/566
@ellisonbg I don't think it's related to the number of users; they're fairly well isolated.
@krassowski we did not turn it off; I think that would have helped with startup time for sure! Not sure it would have helped with the user lag, though. When we re-enable it, finding a way to constrain the servers it looks for seems the way to go. Alternatively, it would be nice if looking for servers doesn't block jupyter_server startup!
> Alternatively, it would be nice if looking for servers doesn't block jupyter_server startup!
Oh yes, that would be ideal. We should make it async, but from a quick skim it would be a potentially breaking change for the next major version: frontends do not know that they need to await initialisation, and we only retry the relevant endpoint twice, 10 seconds apart (so it should work on most machines, but no guarantees):
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/2a524df12da74227bd09316fe339f3a62d66aef2/packages/jupyterlab-lsp/src/manager.ts#L35-L36
And we would need to communicate spec-collection errors on the client side too.
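To make the async idea concrete, here is a minimal sketch of deferring the blocking discovery to a thread once the event loop is running. The class and method names are hypothetical, not the real jupyter_lsp API:

```python
# Sketch (assumption): run the blocking discovery work (entry points,
# is_installed checks, schema validation) off the event loop, so that
# jupyter_server startup is not blocked.
import asyncio


class LazyManagerSketch:
    def __init__(self):
        self._ready = asyncio.Event()
        self.language_servers = {}

    async def initialize(self, discover):
        # push the blocking discovery callable to a worker thread
        self.language_servers = await asyncio.to_thread(discover)
        self._ready.set()

    async def wait_until_ready(self, timeout=10.0):
        # a handler could await this before answering, instead of clients
        # having to retry a fixed number of times
        await asyncio.wait_for(self._ready.wait(), timeout)
```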
Thanks for the feedback :heart_eyes: !
Thanks for reporting this rather than forking, or just badmouthing it elsewhere. As you note, a lot of folk have a lot of opinions on how this stuff should work for their specific use case, and for the most part, we've had to optimize for the single-user, multiple-launch experience. I've found "just" and "need" to come up very frequently, which have become some of my least favorite words in working on FOSS.
There are a ton of things at a managed hub level that would make this experience more fun, especially #184. I'd really love more sourcegraph-like features to work out of the box... but more about that later.
At any rate, lazy loading servers would move this from server startup to (probably) first opening of any document, as we wouldn't know in advance which mimetypes had servers.
Some thoughts:
- all of the discovery and loading currently happens synchronously in `LanguageServerManager.initialize`
  - this could be deferred until the event loop has started (but before it was requested), and then be pushed to a thread, as most of those things are blocking
  - then the `LanguageServersHandler.get` could return a new `version: 3` response which adds `is_complete` and `logs`
    - on the happy path, a client could "unblock" if the server it needs is ready when first requested
    - the "sad path" of not knowing would wait until every discovery technique failed
- slow autocomplete is almost entirely driven by the language server in question
  - for some language servers there are some tricks that can be moved to the image build or `postBuild` equivalent
    - for `jedi`-based language servers, this script prepopulates the underlying cache
    - this cache is not really in a documented format, and is not appropriate for caching/packaging anyway
- longer term, another fix is to build/package offline responses (#184)
  - this would be a whole separate set of executable configurations, as these almost never seem to be integrated directly into language servers
  - sadly, for python, the sourcegraph one has been deprecated in favor of... their new "standard", the protobuf-based SCIP (cue XKCD link)
    - I cannot recommend adding this dependency to the jupyter stack, unless we move the whole shooting match there :stuck_out_tongue_closed_eyes:
  - however, this kind of mirrors the whole experience I've had with the "community-based" LSP experience
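For what the `version: 3` idea might look like on the wire, here is a hypothetical response shape. The `is_complete` and `logs` field names come from the thoughts above; no such schema actually exists yet:

```python
# Hypothetical sketch of a version-3 sessions response; the schema is an
# assumption, not anything implemented in jupyter_lsp.
def build_sessions_response(sessions, discovery_done, logs):
    return {
        "version": 3,
        "sessions": sessions,
        # lets a client unblock as soon as the server it needs is ready,
        # or keep polling while discovery is still running
        "is_complete": discovery_done,
        "logs": list(logs),
    }
```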
At any rate, I'll take a look...
@fperez @yuvipanda we released jupyterlab-lsp 4.0 which includes https://github.com/jupyter-lsp/jupyterlab-lsp/pull/882. Would you like to give it another try and see if the startup time is satisfactory now?
I just did so last night @krassowski! So far it's looking good, many thanks - we'll keep you posted. It feels a lot better in my lightweight testing (and I just added your suggested TeX support with texlab/tectonic, which I'm falling in love with!!)