
[Feature] Resurrection: saving and restoring terminal layout/contents/commands

Open crides opened this issue 1 year ago • 3 comments

Drafting as this is WIP, but want to seek feedback/get discussion going on this.

Currently:

  • content/scrollback saving and restoring works. This is done with a reverse attribute-to-CSI converter, so that the saved contents are "plain text" apart from the attribute escape sequences (just like tmux)
  • basic layout saving/restoring works. The layout is saved to a slightly minimized tree in JSON form.
  • basic shell/cwd/program works. The shell, cwd and foreground* program are obtained through existing APIs. The shell and cwd are restored through pane spawning/splitting; the program is converted back to a shell string and sent to the pane. *Note: "foreground" is technically not the right term, as we grab the newest of the first-level children of the shell and spawn it back.
  • the whole state is wrapped in a .tar.zstd for compression. each pane's content gets its own file (just like tmux)
  • the save/restore functionalities are exported as lua APIs so users can do whatever they want locally. I'm using this for testing (manually saving/restoring):
          { mods = 'CTRL|SHIFT', key = 'S', action = wezterm.action_callback(function(_, _, _)
              wezterm.mux.save_state_to("state_test")
          end) },
          { mods = 'CTRL|SHIFT', key = 'R', action = wezterm.action_callback(function(_, _, _)
              wezterm.mux.restore_state_from("state_test")
          end) },
    
    
  • there's basically 0 error handling. that definitely needs to be improved
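
To illustrate the reverse attribute-to-CSI conversion mentioned in the first bullet, here is a minimal sketch in plain Lua. It is not the code from this PR: the attribute record (`bold`, `italic`, `underline`, `reverse`, `fg`, `bg`) and the function names are hypothetical, and only the basic 8-color SGR codes are covered.

```lua
-- Hypothetical attribute record: { bold, italic, underline, reverse, fg, bg },
-- where fg/bg are palette indices 0-7. These names are assumptions, not wezterm types.
local function attrs_to_sgr(attrs)
  local codes = { 0 } -- start from a reset so earlier state cannot leak in
  if attrs.bold then codes[#codes + 1] = 1 end
  if attrs.italic then codes[#codes + 1] = 3 end
  if attrs.underline then codes[#codes + 1] = 4 end
  if attrs.reverse then codes[#codes + 1] = 7 end
  if attrs.fg then codes[#codes + 1] = 30 + attrs.fg end
  if attrs.bg then codes[#codes + 1] = 40 + attrs.bg end
  return "\27[" .. table.concat(codes, ";") .. "m"
end

-- Serialize one line given as a list of { text = ..., attrs = ... } runs,
-- ending with a reset so the next line starts clean.
local function serialize_line(runs)
  local out = {}
  for _, run in ipairs(runs) do
    out[#out + 1] = attrs_to_sgr(run.attrs) .. run.text
  end
  out[#out + 1] = "\27[0m"
  return table.concat(out)
end
```

Replaying such a file into a fresh pane reproduces the text and its styling without any custom binary format, which is why the saved contents stay readable with ordinary tools.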

This has not been thoroughly tested, and I'll integrate it into my workflow and see how it goes

crides avatar Feb 15 '24 07:02 crides

possibly related:

  • #1949
  • #3237
  • #1326
  • #3091

I remember there should be one more related issue (more about content than layout) but I can't find it (though it's pretty short/less important)

crides avatar Feb 15 '24 07:02 crides

Thanks for the very detailed reply!

but there are also a couple of things that I think might make you frown a little bit, because there's more to reconcile before we can merge in this functionality.

No problem. I would like to do things the correct way for other users, but first there are some things in your comment I would like clarified.


In theory, the UmaskSaver that we create at process start time should mask off the group+other bits from the files that we save so you may not strictly need to take any extra steps around that most basic safeguard.

Yep, the state saves as 0600 right now

But when we explicitly want to load things again later, then we need a persistent key that we can use to decrypt it later so there is additional configuration and infrastructure required to deal with that.

Encryption would be nice, but it'd also be nice if the user can easily dump the contents without having wezterm load it first. I'm not sure of a good way to do both yet.

If we don't have a solution for this then we need to ensure that the save/restore functionality is opt-in and that it is very clear to the user that it is not encrypted and that they need to satisfy themselves that it is suitable for their environment.

I'm not sure about the security parts, but right now the feature is totally opt in. Nothing extra will be done unless the user triggers save_state_to/restore_state_from in their config. Given we use lua for configs, I think exposing the saving/restoring API + some timer/cron utilities would be easy and enough to achieve what tmux-resurrect has, instead of wrapping a bunch of stuff behind config options
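
As a rough sketch of the "API + timer" idea, a config could re-arm a one-shot timer to save periodically. This assumes the `wezterm.mux.save_state_to` function proposed in this PR (it is not part of released wezterm) and assumes it raises a Lua error on failure, which the PR does not guarantee yet. The interval, the `state_auto` name and the `autosave_started` flag are arbitrary choices for illustration.

```lua
local wezterm = require 'wezterm'

local SAVE_INTERVAL_SECONDS = 300 -- arbitrary interval for illustration

local function save_periodically()
  -- save_state_to is the API proposed in this PR; pcall keeps a failed save
  -- from aborting the timer chain (assuming failures surface as Lua errors).
  local ok, err = pcall(wezterm.mux.save_state_to, "state_auto")
  if not ok then
    wezterm.log_error("state save failed: " .. tostring(err))
  end
  -- call_after fires once, so re-arm the timer after each run.
  wezterm.time.call_after(SAVE_INTERVAL_SECONDS, save_periodically)
end

-- The config can be evaluated more than once, so guard against stacking timers.
if not wezterm.GLOBAL.autosave_started then
  wezterm.GLOBAL.autosave_started = true
  wezterm.time.call_after(SAVE_INTERVAL_SECONDS, save_periodically)
end
```

Nothing here runs unless the user puts it in their config, so the feature stays opt-in.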


As for spawning commands, I don't quite understand the problems yet:

  • The command that we spawned into the pane is not necessarily the foreground or youngest process currently running in the pane.

No. But the foreground command running on top of the shell should be. Also, we can't grab the second or deeper children of the shell, because that'll break the spawning relationships

  • Not all panes are local panes with processes (eg: could be a remote ssh session, or a serial port) so querying the process list will never work for those

I'll need to look into those, but in those cases I think we shouldn't restore recursively all the layers? As in, we should just attempt to restore the pane history and connection, but not care about what's running inside of those panes.

  • A pane may be a "local" pane but may be in a different domain (such as an ExecDomain or WslDomain) that has a slightly different way to get launched

I'll need to look into those

  • The command that was spawned could be dangerous! Consider someone launching rm -rf into a pane. Zellij's session resumption prompts the user to confirm that it is safe to re-launch. We could potentially shortcut an unconditional prompt with some basic heuristics.

tmux-resurrect handles this by only allowing default and config-whitelisted processes to be restored. I think this is better than just asking for confirmation*. Maybe we can use a combination of whitelisting, and asking for confirmation for non-whitelisted things.

*: because, if I have 50 panes with editors and repls inside some of them, then I'd need to go to every pane to restore them
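
The combination of whitelisting and confirmation could look roughly like the following plain-Lua sketch. The default list, the function names and the three return values are hypothetical and only illustrate the decision; they are not taken from tmux-resurrect or from this PR.

```lua
-- Hypothetical default allow list; the entries are illustrative only.
local DEFAULT_ALLOWED = { vim = true, nvim = true, less = true, man = true, htop = true }

-- Return the final path component of an executable path.
local function basename(path)
  return (path:match("([^/\\]+)$")) or path
end

-- argv: list of strings for the saved command.
-- user_allowed: optional set of extra program names from the user's config.
-- Returns "restore" (run it), "confirm" (ask first) or "skip" (nothing saved).
local function restore_decision(argv, user_allowed)
  if not argv or #argv == 0 then return "skip" end
  local name = basename(argv[1])
  if DEFAULT_ALLOWED[name] or (user_allowed and user_allowed[name]) then
    return "restore"
  end
  return "confirm"
end
```

With this shape, 50 panes running allow-listed editors come back without any interaction, and only something like `rm -rf` falls through to a confirmation.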

crides avatar Feb 16 '24 04:02 crides

  • Not all panes are local panes with processes (eg: could be a remote ssh session, or a serial port) so querying the process list will never work for those

I'll need to look into those, but in those cases I think we shouldn't restore recursively all the layers? As in, we should just attempt to restore the pane history and connection, but not care about what's running inside of those panes.

If I have a 3-pane layout, where I have a regular local shell on the left, a shell in a docker container (implemented as an ExecDomain that essentially ran docker exec ...) in the middle, and an ssh session (implemented as an SshDomain that used the internal ssh client to connect to a remote host) in the right pane, there's no reason that the save/restore shouldn't be able to handle those three things in the same way.

  • For the local pane case: what was run by the user was the shell, and I think the principle of least surprise is to resume their shell (and show their scrollback up to that point). I think running whatever the most recent command was in the shell (assuming that we figure it out correctly) might be interesting, but would be confusing when that program terminates and it doesn't leave them in their shell
  • For the docker container case, I don't think we can guarantee to see inside the container to understand what it was running purely from process tree introspection, and again, I think just dropping back into the command that was used to spawn the pane is the least surprising and probably most anticipated outcome
  • For the ssh case we definitely cannot see into what it was running, but it is similar to the other two cases where I think the reasonable expectation is to restore the prior scrollback and drop you into a freshly logged in shell there
  • The command that was spawned could be dangerous! Consider someone launching rm -rf into a pane. Zellij's session resumption prompts the user to confirm that it is safe to re-launch. We could potentially shortcut an unconditional prompt with some basic heuristics.

tmux-resurrect handles this by only allowing default and config-whitelisted processes to be restored. I think this is better than just asking for confirmation*. Maybe we can use a combination of whitelisting, and asking for confirmation for non-whitelisted things.

*: because, if I have 50 panes with editors and repls inside some of them, then I'd need to go to every pane to restore them

Yeah, I think a combination of an allow list and some reasonable heuristics would make sense for this. That said, I don't think the prompt would be onerous: what I had in mind is that the pane is restored and you will see a simple text prompt in that pane to initiate a restart, so it's not like it would be a series of 50 modals that you had to consider immediately on restore, you would only have to confront it if/when you wanted to.
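
One way to read the three cases and the in-pane prompt together is as a per-pane restore plan: always bring back the scrollback and respawn whatever originally launched the pane, and treat any observed command as something offered for restart rather than run automatically. The record fields (`domain`, `spawn_argv`, `scrollback_file`, `last_command`) and the function below are hypothetical, a sketch of the idea rather than wezterm's data model.

```lua
-- Hypothetical saved-pane record:
--   { domain = "local" | "exec" | "ssh", spawn_argv = {...} or nil,
--     scrollback_file = "...", last_command = {...} or nil }
local function plan_restore(pane)
  local plan = {
    domain = pane.domain,
    scrollback_file = pane.scrollback_file,
    -- Respawn what originally launched the pane; nil stands for the domain's
    -- default program (for example a login shell).
    spawn_argv = pane.spawn_argv,
    pending_command = nil,
  }
  -- Only local panes can be introspected through the process tree, so only
  -- they can carry a command to offer for restart.
  if pane.domain == "local" and pane.last_command then
    plan.pending_command = pane.last_command -- offered in the pane, never auto-run
  end
  return plan
end
```

The docker and ssh panes then restore exactly like the local one, minus the optional restart prompt.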

wez avatar Feb 17 '24 01:02 wez

I don't think security is that important if it's something that has to be willingly enabled in Lua.

Destroy666x avatar Mar 06 '24 21:03 Destroy666x

Maybe it could be made secure by storing the data in /tmp and introducing a Lua setting to specify your own command to store it to disk and retrieve it again. The default would be a naive copy, and we could add a warning that this configuration is insecure.

People who want more security can then encrypt their restoration data, filter it as they see fit, or use any third-party program that handles backups from tmpfs in general.
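
A minimal sketch of that hook idea in plain Lua, under the assumption that the hooks are functions taking a source and destination path. `make_storage`, `opts.store`, `opts.retrieve` and the `insecure` flag are hypothetical names, not an existing wezterm setting.

```lua
-- Naive default: copy the file byte for byte, with no encryption.
local function copy_file(src, dst)
  local input, err = io.open(src, "rb")
  if not input then return nil, err end
  local data = input:read("*a")
  input:close()
  local output, err2 = io.open(dst, "wb")
  if not output then return nil, err2 end
  output:write(data)
  output:close()
  return true
end

-- opts.store / opts.retrieve are hypothetical user hooks taking (src, dst).
local function make_storage(opts)
  opts = opts or {}
  return {
    store = opts.store or copy_file,
    retrieve = opts.retrieve or copy_file,
    -- Lets the caller warn when the unencrypted default store is in use.
    insecure = opts.store == nil,
  }
end
```

A user who wants encryption would pass their own `store` and `retrieve` functions that run the file through whatever tool they trust, while everyone else gets the plain copy plus the warning.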

Mikilio avatar Mar 10 '24 15:03 Mikilio