RemoteREPL.jl icon indicating copy to clipboard operation
RemoteREPL.jl copied to clipboard

Support shared sessions

Open filchristou opened this issue 1 year ago • 5 comments

Redesign for https://github.com/c42f/RemoteREPL.jl/pull/45

Novelties and summary

In #45 we discussed that it would be better to use the underlying protocol to change the module. It became clear that this is easier done if shared sessions were supported. So instead of directly solving my problem I tried to introduce multiple shared sessions. As a result, module update for Pluto is now also more straightforward.

1. session_id

Basically I introduced a session_id that characterizes every ServerSideSession. The server holds all the sessions in a Dict{UUID, ServerSideSession}. Everytime a client connects, it send the session UUID it wants to connect to as the first thing.

2. SessionSideSession

Now SessionSideSession is not described by a single Socket but a vector of Sockets, which is kept up to date.

3. @remoterepl

I need a way to still connect remotely and change the module, i.e. %module ..., without having a REPL available (like Pluto). @remote support only command execution. Therefore I introduced @remoterepl which gives access to the REPL magic commands for all clients. The question is.. why not collapse @remote to @remoterepl, since the second is a superset ?

4. Structural changes

I put all the using dependencies on the src/RemoteREPL.jl file. I find it much more readable this way and it also follows community conventions.

Pluto side

Pluto notebook should look something like this gist. I am not sure how to avoid the updating every second, but don't focus on that; this is a Pluto issue/discussion. Basically Pluto just uses @remoterepl to upgrade the module in a specific session. The core code boils down to

server = Sockets.listen(Sockets.localhost, 27765)

@async serve_repl(server)

session_id = UUID("f03aec15-3e14-4d58-bcfa-82f8d33c9f9a")

con2server = connect_remote(Sockets.localhost, 27765; session_id=session_id)

# define function to get more recent module
takemodulesymbol() = Symbol("workspace#" ,PlutoRunner.moduleworkspace_count[])

# get module
mod = eval(takemodulesymbol())

# use `@remoterepl` to update module
@eval(@remoterepl $"%module $mod")

Test

Run the pluto notebook. Open a julia kernel, add dependencies and do:

julia> using RemoteREPL, Sockets, UUIDs

julia> connect_repl(Sockets.localhost, 27765; session_id=UUID("f03aec15-3e14-4d58-bcfa-82f8d33c9f9a"))
[ Info: Using session id f03aec15-3e14-4d58-bcfa-82f8d33c9f9a
REPL mode remote_repl initialized. Press > to enter and backspace to exit.
"Prompt(\"julia@localhost:27765> \",...)"

julia@localhost:27765> temp
1

julia@localhost:27765> @__MODULE__
Main.var"workspace#1261"

julia@localhost:27765> @__MODULE__
Main.var"workspace#1265"

julia> # uncomment "# anewvar = 2"

julia> @remote(anewvar)
2

As you can see the remote client closely follows the changes in the Pluto notebook.

Let me know how that looks and I am happy to hear your thoughts!

filchristou avatar Jun 21 '23 18:06 filchristou

Some summary for the force-push:

I generally agree with your comments, so I removed @remoterepl and I overloaded run_remote_repl_command

run_remote_repl_command

I think it's better to keep the same name and not unnecessarily introduce breaking changes. Now it's a bit more welcoming to use as I exported it and added docs. Because of the argument order, we need to manually overload by defining the other two methods. However everything could be simplified to a single function signature like the following:

function run_remote_repl_command(cmdstr::String, conn::Connection=_repl_client_connection, out_stream::IO=Base.stdout)

but it would be breaking.

Maybe we could make a list of desired breaking changes and include this one (?)

mutex

I implemented the design as shown in https://github.com/c42f/RemoteREPL.jl/pull/55#discussion_r1249529472 Let me know if you have a different implementation in mind.

remote_module!

I also implemented the RemoteREPL.remote_module! function you were saying. The function is exported. The Pluto notebook now also uses remote_module! instead.

docs/Manifest.toml

Currently CI.yml Pkg.develops the current directory. Since the Manifest.toml is already checked in, we can have dev .. there. This is also easier for local docs generation, where the CI.yml is unusable. I also added some docs for session_id and the workflow with Pluto.

tests

It would be nice to develop some tests about the sessions sharing, but currently nothing is implemented. If you don't consider it urgent, maybe in a following PR ?

filchristou avatar Jul 02 '23 16:07 filchristou

Because of the argument order, we need to manually overload by defining the other two methods

No problems. I like the current argument ordering anyway by analogy to other functions like show etc which take the IO as an initial argument. And also I feel like the Connection is the "primary argument" so should go first.

but it would be breaking.

Well run_remote_repl_command is not documented anywhere as part of the public API. So I think it would be ok to rename it. Also it would probably be ok to just release version 1.0 if we were worried. This package seems somewhat stable at this point :woman_shrugging:

[Tests] If you don't consider it urgent, maybe in a following PR ?

Mmm I'd prefer tests added as part of the PR. Tests can kind of be a pain but they're so essential in the long run. It's easier to write them when the new stuff is fresh in mind and to be motivated to do it if you want to use the feature they are testing.

Testing shouldn't need to be super extensive for this feature. Just a few tens of lines should be enough and the client-server setup already exists in runtests.jl.

c42f avatar Jul 03 '23 05:07 c42f

It would probably be ok to just release version 1.0 if we were worried.

before going to 1.0, it would be nice to talk a bit about https://github.com/c42f/RemoteREPL.jl/issues/56

Update summary

  1. Renamed run_remote_repl_command to remotecmd
  2. Added some further docs
  3. reordered remote_module! function's arguments to resemble to remotecmd etc.
  4. Added tests. Also the test server now uses TestEnv, because otherwise the registry version of RemoteREPL is used rather the developed https://github.com/c42f/RemoteREPL.jl/pull/55/commits/d9b836925d5384e2450b5da77612bb39f460d316#diff-3b9314a6f9f2d7eec1d0ef69fa76cfabafdbe6d0df923768f9ec32f27a249c63R117 I think I successfully added TestEnv in the dependencies of the GHAction-julia, and shouldn't be a problem https://github.com/c42f/RemoteREPL.jl/pull/55/commits/d9b836925d5384e2450b5da77612bb39f460d316#diff-3ab46ee209a127470fce3c2cf106b1a1dbadbb929a4b5b13656a4bc4ce19c0b8R76 Could you try running the test please ?

filchristou avatar Jul 03 '23 12:07 filchristou

@c42f Hi Claire. If there are no issues with the PR, could you please merge it and maybe make a new release ?

filchristou avatar Nov 09 '23 07:11 filchristou

@c42f Are there any blocking points? Do you think we can merge that soon?

filchristou avatar Feb 26 '24 11:02 filchristou

I've fixed the conflicts by rebasing and pushing to the pull request branch.

That also seems to have triggered CI. Can you have a look at what's gone wrong there?

c42f avatar Jul 16 '24 04:07 c42f

ouf. that's gonna be hard. Locally using TestEnv I get the tests passed, but with the ] test something fails with a different error from the CI. Also sometimes (mostly when I exit the kernel) I get a non fatal error printing the following

error in running finalizer: Base.IOError(msg="write: broken pipe (EPIPE)", code=-32)
uv_write at ./stream.jl:1066
unsafe_write at ./stream.jl:1120
write at ./strings/io.jl:248 [inlined]
writeheader at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:703
serialize at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:776
unknown function (ip: 0x7fa575ae73b9)
_jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077
close at /u/home/wima/fchrstou/GitProjects/RemoteREPL.jl/src/client.jl:166
unknown function (ip: 0x7fa575ad01e5)
_jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077
run_finalizer at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gc.c:318
jl_gc_run_finalizers_in_list at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gc.c:408
run_finalizers at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gc.c:454
ijl_atexit_hook at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/init.c:299
jl_repl_entrypoint at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/jlapi.c:732
main at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/cli/loader_exe.c:58
unknown function (ip: 0x7fa581eb1249)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)

I will have to look closer to that when I find some time. Btw thanks for making me a collaborator :)

filchristou avatar Jul 16 '24 21:07 filchristou

Oof. I suspect I've seen that finalizer error before. Probably not your fault! It's hard to clean up the ssh tunnel cleanly when the process exists. (Unless there's a way to hook the REPL shutdown more cleanly?)

This is a relatively hard package to test.

c42f avatar Jul 17 '24 05:07 c42f

As said. The problem was very obscure and I still don't understand what was the issue. include("test/runtests.jl") was working but ] test was not. I just removed all the TestEnv stuff and now it seems that the tests consistently pass. @c42f can we merge ?

filchristou avatar Jul 23 '24 14:07 filchristou

@c42f pushed an update. magic protocol is checked first now.

filchristou avatar Jul 26 '24 13:07 filchristou

Should we make a v0.3.0 release ? If I have permissions and it's fine by you I can make one.

filchristou avatar Jul 29 '24 09:07 filchristou

This PR seems to have broken the SSH tunnel tests :-(

They work for me before this PR, but not after.

I can't remember exactly, but I think the SSH tests are not enabled in CI because getting the SSH server working in the CI environment wasn't finished.

c42f avatar Jul 30 '24 03:07 c42f