Pluto.jl
Pluto package manager becomes unable to update packages when terminal session ends
Situation:
Pluto is running on an AWS EC2 instance, which we connect to remotely. When I start the server, I can update packages in the notebook using the built-in package manager just fine. The next day, packages no longer update until Pluto is restarted. It may not detect that an update in the registry is available, or if you do try to update, the notebook hangs with the busy indicator running indefinitely.
Hypotheses so far:
- Disconnecting from the SSH terminal session that started Pluto is causing something to fail?
Versions affected: At least 0.19.4, 0.19.5
May also be relevant that I'm using a private registry in addition to the general registry.
Before going on a wild-goose chase, one quick question. Are you sure that connecting to the outside world is still possible apart from SSH? So does ping 1.1.1.1 work for example? I've had it before that remote instances reset networking in the night causing it to fail
Good to check. Pinging seems to be fine. I only restart the Julia process and run Pluto though, I presume that wouldn't fix an underlying network reset?
Can you make a video recording? Do you have access to Pluto's logs? You can run Pluto with
import Pluto; ENV["JULIA_DEBUG"] = Pluto; Pluto.run()
to get internal logs.
It's not very exciting, but I can do one next time I notice it, probably by tomorrow.
Ah, I've been trying to find a way to get the logs out. Thanks
Oh, but that only works if it was active beforehand? Does Pluto log the normal terminal output somewhere?
My code snippet above just tells Pluto to show even more log messages, they still show up in the default place ("terminal where you launched Pluto"), so it depends on your setup how to read them. e.g. if the Pluto server is a systemctl job then journalctl can show the logs. If it's a tmux pane, just attach to it.
Currently I'm just running a script to run Julia with Pluto in the background, was looking for something more sophisticated though.
But if my hypothesis is anywhere near correct, using a system process as described would potentially prevent the issue from showing up at all.
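Short of a full system service, a launcher that points stdin, stdout and stderr away from the terminal is a common middle ground. Below is a minimal sketch; `run_detached` is a hypothetical helper name, not something from Pluto, and whether this avoids the hang described in this issue is an untested assumption.

```shell
# Hypothetical helper: start a command so that none of its standard
# streams refer to the launching terminal.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
  log=$1
  shift
  # nohup makes the command ignore SIGHUP. stdin is read from /dev/null,
  # and stdout and stderr both go to the log file.
  nohup "$@" < /dev/null > "$log" 2>&1 &
  # Print the PID of the background process so it can be signalled later.
  echo $!
}
```

For Pluto this could be called as `run_detached pluto.log julia -e 'import Pluto; Pluto.run(launch_browser = false)'`.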
In fact, I can reproduce it reliably:
julia -e "using Pluto; Pluto.run(Pluto.ServerSession(; secret = \"mysecret\", options = Pluto.Configuration.from_flat_kwargs(; launch_browser = false)))" > pluto.log &
(In other terminal window) ssh [email protected] -NL 1234:localhost:1234
**** Here: open new notebook and type `using NearestNeighbors`, it loads fine after a few seconds ****
exit
**** Here: now add `using Colors`, it hangs after running for ~60 microseconds. ****
Worth saying that anything other than triggering the package manager still works. Once the package manager is triggered, you may not be able to do anything else.
This might be a bug in Pkg, maybe related to us passing in a custom IO to capture Pkg terminal output... Well found @BioTurboNick !
I noticed the > pluto.log in your snippet. Do these log messages show any errors? It would be great if you could somehow send an interrupt to the background Pluto process after it gets stuck, because this would throw an InterruptException in exactly the place where it is stuck, and the stack trace in the logs would tell us where to look.
Unfortunately that redirect is spotty. Sometimes the log has details, other times it's empty. I'm not great at Linux shell so maybe I'm not doing something right. 🙃
Okay, I've narrowed it down I think -
- Redirect stdout only : problem
- Redirect stderr only : no problem
- Redirect both stdout and stderr : no problem
Unfortunately, the information I'd need is apparently in stderr. And it seems anything I do to see it prevents the issue from occurring.
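For reference, the three cases differ only in where each stream ends up. The sketch below spells them out with a stand-in command; `emit` is a hypothetical placeholder for the Pluto process, and the log file names are made up for illustration. In the stdout-only case, stderr is the one stream still attached to the terminal that launched the process.

```shell
# Hypothetical stand-in for the Pluto process: one line on each stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

tmp=$(mktemp -d)

# Case 1: redirect stdout only. stderr still points at the launching terminal.
emit > "$tmp/stdout_only.log"

# Case 2: redirect stderr only. stdout still points at the launching terminal.
emit 2> "$tmp/stderr_only.log"

# Case 3: redirect both. Order matters: send stdout to the file first,
# then make stderr a copy of stdout with 2>&1.
emit > "$tmp/both.log" 2>&1
```

With case 3, whatever the process writes to stderr lands in the same file as stdout, so nothing depends on the terminal staying open.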