jupyterlab-lsp
Slow startup on a cloud-hosted Hub
Description
We (well, @yuvipanda and team did the hard work) installed the LSP tools on a cloud-hosted hub I use to teach a ~1,200-student course this term. Unfortunately, Yuvi and team found performance issues; this commit describes slow startup times.
In regular usage, I also noticed tab completion being slower than usual, to the point where it interferes with the typing workflow (to actually save typing, tab completion really needs to keep up with ~70 wpm typing speeds, which means its latency budget is very, very tight).
Eventually we disabled it, since for us smooth performance of the hub is critical. Yuvi may have more details.
I'm sorry not to have more specifics, but at least flagging it here for the team. In the long run I hope we'll be able to deploy this in production everywhere, as I'd love to teach students with these enhanced capabilities.
Thanks for the great work, btw!
cc @ellisonbg
Thanks for opening this. Do you have any reason to believe that the performance issues are related to the number of users in the deployment, or do you think that detail is unrelated to the root cause?
This previous issue seems to be describing the same problem: https://github.com/jupyter-lsp/jupyterlab-lsp/issues/796.
I think that the commit description is right: we are checking for installed language servers, which takes time. It might take even longer on Windows due to the expansion of the search path introduced in https://github.com/jupyter-lsp/jupyterlab-lsp/pull/731 (not yet released).
@yuvipanda, did you try disabling auto-detection of language servers with the `autodetect` traitlet (docs) and specifying configuration for only the language servers relevant to your course (I guess a Python LSP server + maybe Markdown, but likely not all three Python language servers and a dozen others) via config files? The logic is in:
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L122-L139
You might want to open a PR changing the `autodetect` traitlet to `Union([Bool, Set])`; if a set is provided, we would only attempt to autodetect the servers listed in it, which would save you from adding custom configuration for those servers.
The autodetection logic is in: https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L228-L288
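As a rough illustration of that suggested change, here is a minimal sketch (not the actual jupyter_lsp code; the class and method names are invented) of an `autodetect` traitlet that accepts either a bool or a set of server names:

```python
# Hypothetical sketch of the suggested Union([Bool, Set]) autodetect traitlet.
# Class and method names are assumptions, not the real jupyter_lsp API.
from traitlets import Bool, Set, Unicode, Union
from traitlets.config import Configurable


class LanguageServerManagerSketch(Configurable):
    autodetect = Union(
        [Bool(), Set(Unicode())],
        default_value=True,
        help="True/False, or a set of language server names to probe",
    ).tag(config=True)

    def servers_to_probe(self, all_specs):
        """Filter candidate specs down to the ones we should autodetect."""
        if self.autodetect is True:
            return dict(all_specs)
        if self.autodetect is False:
            return {}
        # a set was provided: only probe the explicitly listed servers
        return {k: v for k, v in all_specs.items() if k in self.autodetect}
```

A deployment could then set e.g. `autodetect = {"pylsp"}` in its config file to skip probing every other server.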
While profiling would be needed, my educated guess is that:
- a few percent is lost on entry points
- much more is lost on server schema validation; good caching would solve this, though it may not help if you are booting from a fresh image. Maybe we could allow disabling the validation, but that should be flagged as dangerous, since it can lead to runtime errors if your specification gets modified later on.
- as much or more time is spent invoking the `is_installed` check; edit: one could review the `is_installed` scripts to see if we can accelerate those.
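On that last point, one hypothetical way to accelerate the `is_installed` checks would be to run them concurrently rather than one after another. A minimal sketch, assuming the checks boil down to PATH lookups (the real checks may do more work):

```python
# Sketch (assumption): probe all server binaries in parallel instead of
# sequentially, so total wait is bounded by the slowest single check.
import shutil
from concurrent.futures import ThreadPoolExecutor


def is_installed(command: str) -> bool:
    # a typical check just looks the binary up on PATH
    return shutil.which(command) is not None


def probe_servers(commands):
    """Return {command: installed?} with all probes running in parallel."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = pool.map(is_installed, commands)
    return dict(zip(commands, results))
```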
Further, reworking these two lines could reduce the startup time penalty by up to 50% (probably less):
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/b895b4ee1375c3d5f7d209ce36b4877d1185cde4/python_packages/jupyter_lsp/jupyter_lsp/manager.py#L119-L120
We could first find all servers and then filter that list down to the installed ones. Pull requests welcome :)
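The find-then-filter idea could look roughly like this (a sketch of the shape, not the actual manager code; function names are invented):

```python
# Sketch (assumption about the internals): gather every candidate spec first
# (cheap), then run the potentially slow installation checks as a separate
# filtering pass. Splitting the phases makes the checks easy to parallelise
# or skip entirely.
def collect_specs(spec_finders):
    """Phase 1: merge every candidate spec; no subprocess calls here."""
    specs = {}
    for finder in spec_finders:
        specs.update(finder())
    return specs


def filter_installed(specs, is_installed):
    """Phase 2: keep only specs whose server is actually present."""
    return {key: spec for key, spec in specs.items() if is_installed(spec)}
```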
I answered on the completer topic in: https://github.com/jupyter-lsp/jupyterlab-lsp/issues/566
@ellisonbg I don't think it's related to the number of users; they're fairly well isolated.
@krassowski we did not turn it off; I think that would have helped with startup time for sure! Not sure it would have helped with the user lag, though. When we re-enable it, finding a way to constrain the servers it looks for seems the way to go. Alternatively, it would be nice if looking for servers doesn't block jupyter_server startup!
> Alternatively, it would be nice if looking for servers doesn't block jupyter_server startup!
Oh yes, that would be ideal. We should make it async, but from a quick skim it would be a potentially breaking change for the next major version: frontends do not know that they need to await initialisation, and we only retry the relevant endpoint twice, 10 seconds apart (so it should work on most machines, but no guarantees):
https://github.com/jupyter-lsp/jupyterlab-lsp/blob/2a524df12da74227bd09316fe339f3a62d66aef2/packages/jupyterlab-lsp/src/manager.ts#L35-L36
And we would need to communicate spec-collection errors on the client side too.
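To make the async idea concrete, here is a minimal sketch of deferring the blocking discovery to a thread once the event loop is running. The class and method names are hypothetical, not the real jupyter_lsp API:

```python
# Sketch (assumption): run the blocking discovery work (entry points,
# is_installed checks, schema validation) off the event loop, so that
# jupyter_server startup is not blocked.
import asyncio


class LazyManagerSketch:
    def __init__(self):
        self._ready = asyncio.Event()
        self.language_servers = {}

    async def initialize(self, discover):
        # push the blocking discovery callable to a worker thread
        self.language_servers = await asyncio.to_thread(discover)
        self._ready.set()

    async def wait_until_ready(self, timeout=10.0):
        # a handler could await this before answering, instead of clients
        # having to retry a fixed number of times
        await asyncio.wait_for(self._ready.wait(), timeout)
```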
Thanks for the feedback :heart_eyes: !
Thanks for reporting this rather than forking, or just badmouthing it elsewhere. As you note, a lot of folk have a lot of opinions on how this stuff should work for their specific use case, and for the most part, we've had to optimize for the single-user, multiple-launch experience. I've found "just" and "need" to come up very frequently, which have become some of my least favorite words in working on FOSS.
There are a ton of things at a managed hub level that would make this experience more fun, especially #184. I'd really love more sourcegraph-like features to work out of the box... but more about that later.
At any rate, lazy loading servers would move this from server startup to (probably) first opening of any document, as we wouldn't know in advance which mimetypes had servers.
Some thoughts:
- all of the discovery and loading currently happens synchronously in `LanguageServerManager.initialize`
  - this could be deferred until the event loop has started (but before it was requested), and then be pushed to a thread, as most of those things are blocking
  - then the `LanguageServersHandler.get` could return a new `version: 3` response which adds `is_complete` and `logs`
    - on the happy path, a client could "unblock" if the server it needs is ready when first requested
    - the "sad path" of not knowing would wait until every discovery technique failed
- slow autocomplete is almost entirely driven by the language server in question
  - for some language servers there are some tricks that can be moved to the image build or `postBuild` equivalent
    - for `jedi`-based language servers, this script prepopulates the underlying cache
    - this cache is not really in a documented format, and is not appropriate for caching/packaging anyway
- longer term, another fix is to build/package offline responses (#184)
  - this would be a whole separate set of executable configurations, as these almost never seem to be integrated directly into language servers
  - sadly, for python, the sourcegraph one has been deprecated in favor of... their new "standard", the protobuf-based SCIP (cue XKCD link)
    - I cannot recommend adding this dependency to the jupyter stack, unless we move the whole shooting match there :stuck_out_tongue_closed_eyes:
  - however, this kind of mirrors the whole experience I've had with the "community-based" LSP experience
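For what the `version: 3` idea might look like on the wire, here is a hypothetical response shape. The `is_complete` and `logs` field names come from the thoughts above; no such schema actually exists yet:

```python
# Hypothetical sketch of a version-3 sessions response; the schema is an
# assumption, not anything implemented in jupyter_lsp.
def build_sessions_response(sessions, discovery_done, logs):
    return {
        "version": 3,
        "sessions": sessions,
        # lets a client unblock as soon as the server it needs is ready,
        # or keep polling while discovery is still running
        "is_complete": discovery_done,
        "logs": list(logs),
    }
```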
At any rate, I'll take a look...
@fperez @yuvipanda we released jupyterlab-lsp 4.0 which includes https://github.com/jupyter-lsp/jupyterlab-lsp/pull/882. Would you like to give it another try and see if the startup time is satisfactory now?
I just did so last night @krassowski! So far it's looking good, many thanks - we'll keep you posted. It feels a lot better in my lightweight testing (and I just added your suggested TeX support with texlab/tectonic, which I'm falling in love with!!)