SSH agent forwarding doesn't work with `unix_domains`

Open maddiemort opened this issue 3 years ago • 9 comments

What Operating System(s) are you seeing this problem on?

Linux X11, macOS

WezTerm version

20220101-133340-7edc5b5a

Did you try the latest nightly build to see if the issue is better (or worse!) than your current version?

No, and I'll explain why below

Describe the bug

I tried to use the information in #1568 to connect to a remote wezterm-mux-server in an attempt to get SSH agent forwarding working in a wezterm unix domain. Agent forwarding works through a normal ssh -A hostname command, but not through wezterm connect hostname: a socket is created on the remote host, as it is with ssh -A, but $SSH_AUTH_SOCK is not set. Manually running export SSH_AUTH_SOCK=/path/to/socket gets agent forwarding working, but this is not a long-term solution.

This issue is a continuation of my comments on #1568.

To Reproduce

I have wezterm 20220101-133340-7edc5b5a installed on rigel, which is the client machine running macOS 12.0.1 (21A559) and configured through nix-darwin. The remote host, talitha, is a NixOS 21.11.20220210.7adc9c1 (Porcupine) machine also running wezterm 20220101-133340-7edc5b5a.

The client is running ssh-agent from OpenSSH 8.8p1, and the agent has an ed25519-sk key loaded, which is resident on a Yubikey. First, to test that SSH agent forwarding is working correctly when using ssh normally, I can run ssh -A talitha. This causes the Yubikey to flash, prompting me to touch it, at which point the SSH connection succeeds. Then running ssh -T [email protected] on talitha through the SSH connection also causes the Yubikey to flash, and when I touch it, the connection succeeds. Running echo $SSH_AUTH_SOCK on talitha reports, e.g. /tmp/ssh-XXXXjYEJ5n/agent.235345 (this changes on each new connection), and ssh-add -L reports the correct key is available.

Then I added the unix_domains section I've provided below to my ~/.config/wezterm/wezterm.lua (from #1568) and reloaded my configuration. wezterm connect talitha does the following things:

  • Prints on the client:
    2022-02-19T16:58:00.614Z INFO  wezterm_gui::termwindow > OpenGL initialized! AMD Radeon Pro 5500M OpenGL Engine 4.1 ATI-4.7.29 is_context_loss_possible=false wezterm version: 20220101-133340-7edc5b5a
    
  • Causes the Yubikey to flash, prompting me to touch it
  • Opens a window with the following contents:
    Connect to Proxy(["ssh", "-T", "-A", "talitha", "wezterm", "cli", "proxy"])
    Connected!
    Checking server version
    

When I touch the Yubikey, it then:

  • Prints this line again, on the client:
    2022-02-19T16:58:01.407Z INFO  wezterm_gui::termwindow > OpenGL initialized! AMD Radeon Pro 5500M OpenGL Engine 4.1 ATI-4.7.29 is_context_loss_possible=false wezterm version: 20220101-133340-7edc5b5a
    
  • Successfully connects to talitha, opening a prompt in the new window

Now, running ssh -T [email protected] on talitha through the SSH connection does not cause the Yubikey to flash and immediately prints the following:

Confirm user presence for key ED25519-SK SHA256:<hash redacted>
sign_and_send_pubkey: signing failed for ED25519-SK "/home/soren/.ssh/id_ed25519_sk": invalid format
[email protected]: Permission denied (publickey).

Running echo $SSH_AUTH_SOCK on talitha produces an empty output, and ssh-add -L prints:

Could not open a connection to your authentication agent.

If I then run echo /tmp/ssh*/* to determine the location of the forwarded SSH socket, and then manually set $SSH_AUTH_SOCK:

export SSH_AUTH_SOCK=/tmp/ssh-XXXXrXLX1d/agent.236885

...then ssh -T [email protected] succeeds (causes the Yubikey to flash, and connects successfully after I touch it), and ssh-add -L correctly reports the key is available.
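The manual lookup above could be scripted along these lines. This is only a sketch of the stopgap, not a fix: `newest_agent_sock` is a hypothetical helper name, the `/tmp/ssh-*/agent.*` layout is assumed from the paths shown in this report, and it is written for bash rather than zsh (zsh reports an error on a glob that matches nothing).

```shell
#!/usr/bin/env bash
# Sketch: pick the most recently modified forwarded-agent socket.
# newest_agent_sock is a hypothetical helper, not part of wezterm or OpenSSH.
newest_agent_sock() {
  local base="${1:-/tmp}" candidate newest=""
  for candidate in "$base"/ssh-*/agent.*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$candidate" ] || continue
    if [ -z "$newest" ] || [ "$candidate" -nt "$newest" ]; then
      newest="$candidate"
    fi
  done
  printf '%s\n' "$newest"
}

sock="$(newest_agent_sock /tmp)"
# Only export the path if it really is a socket.
if [ -n "$sock" ] && [ -S "$sock" ]; then
  export SSH_AUTH_SOCK="$sock"
fi
```

Picking the newest socket is a guess about which connection is the live one, so it is worth confirming with ssh-add -L afterwards.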

Aside: during the process of writing out these steps, I discovered that if I don't touch the Yubikey after running wezterm connect talitha, allowing it to time out, the command then prints this on the client:

sign_and_send_pubkey: signing failed for ED25519-SK "/Users/soren/.ssh/id_ed25519_sk" from agent: agent refused operation
soren@talitha: Permission denied (publickey,password,keyboard-interactive).
2022-02-19T16:59:43.266Z ERROR wezterm_client::client  > Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value
2022-02-19T16:59:43.266Z ERROR mux::connui             > while running ConnectionUI loop: recv_timeout: channel is empty and disconnected
2022-02-19T16:59:43.266Z ERROR wezterm_gui             > Please install the same version of wezterm on both the client and server! The server reported error 'Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value' while being asked for its version.  This likely means that the server is older than the client.
; terminating

This is reminiscent of the output that appears in the new window when I try to connect by right-clicking the + in the tab bar and selecting "attach domain talitha", instead of running wezterm connect talitha:

Connect to Proxy(["ssh", "-T", "-A", "talitha", "wezterm", "cli", "proxy"])
Connected!
Checking server version
Please install the same version of wezterm on both the client and server! The server reported error 'Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value' while being asked for its version.  This likely means that the server is older than the client.

Failed: Please install the same version of wezterm on both the client and server! The server reported error 'Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value' while being asked for its version.  This likely means that the server is older than the client.

Error during attach: Please install the same version of wezterm on both the client and server! The server reported error 'Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value' while being asked for its version.  This likely means that the server is older than the client.

Configuration

On the client (rigel), this is the relevant section of my config:

unix_domains = {
  {
    name = "talitha",
    proxy_command = { "ssh", "-T", "-A", "talitha", "wezterm", "cli", "proxy" },
  },
},

Expected Behavior

I expect wezterm connect talitha to open a new Wezterm window containing a prompt that behaves the same way as the one I get when I run ssh -A talitha - I should be able to perform SSH operations using the ssh-agent running on my local machine.

Logs

CTRL-SHIFT-L doesn't open anything on my macOS machine; it just seems to send CTRL-L to the terminal, clearing it.

Anything else?

I haven't tried the latest nightly build to see if it helps, because it's a bit of a pain - for me, that involves having to write a Nix overlay - but I'm happy to do that a little later today if you think it'd be a good idea.

I've tried adding this to rigel's ~/.ssh/config:

Host talitha
  AddKeysToAgent yes
  ForwardAgent yes

...and this to talitha's ~/.config/wezterm/wezterm.lua:

mux_env_remove = {
  "SSH_AUTH_SOCK",
  "SSH_CLIENT",
  "SSH_CONNECTION",
}

...but neither made a difference.

maddiemort avatar Feb 19 '22 17:02 maddiemort

Try this on the remote host:

mux_env_remove = {
  -- "SSH_AUTH_SOCK",  -- remove this line
  "SSH_CLIENT",
  "SSH_CONNECTION",
}

wez avatar Feb 23 '22 15:02 wez

Just tried that, it didn't make any difference - which is probably something to do with the fact that, like SSH_AUTH_SOCK, those other two variables are empty when I connect through wezterm connect talitha, whether or not the mux_env_remove section is included in the remote host's config.

When I connect through ssh -A talitha, they are both non-empty.

maddiemort avatar Feb 23 '22 16:02 maddiemort

OK, so what's happening here is:

  • When you first start the mux daemon, it discards those environment variables. Subsequently spawned children of the multiplexer will have that same environment and not know how to talk to the ssh agent.
  • You could adjust mux_env_remove to preserve the environment, in which case agent forwarding would work for that initially spawned mux scenario
  • However, if you disconnect and later reconnect, that environment would be stale and non-functional.
  • It's not possible to inject environment variables into running processes, so there's no way to automatically update those shells when you reconnect
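A small demonstration of the staleness described in the list above, as a sketch rather than anything wezterm-specific: a child process keeps the environment it was spawned with, even if the parent later changes the variable. `DEMO_AUTH_SOCK` and both paths are made-up stand-ins for `SSH_AUTH_SOCK` and real agent sockets.

```shell
#!/usr/bin/env bash
# Sketch: a child's environment is a snapshot taken when it is spawned.
# DEMO_AUTH_SOCK and the paths below are illustrative placeholders.
out="$(mktemp)"

export DEMO_AUTH_SOCK=/tmp/ssh-old/agent.1
# The child reads its own copy of the variable one second after starting.
sh -c 'sleep 1; printf "%s\n" "$DEMO_AUTH_SOCK"' > "$out" &

# Meanwhile the parent "reconnects" and gets a new socket path.
export DEMO_AUTH_SOCK=/tmp/ssh-new/agent.2
wait

child_view="$(cat "$out")"
rm -f "$out"
echo "parent now has:  $DEMO_AUTH_SOCK"
echo "child still had: $child_view"
```

The child reports /tmp/ssh-old/agent.1 even though it reads the variable after the parent has changed it. Shells already running inside the mux are in the same position after a reconnect.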

This issue is not specific to wezterm; it applies also to screen and tmux.

This gist: https://gist.github.com/martijnvermaat/8070533 has some suggestions on how to manage this sort of thing cooperatively between your shell and ssh configuration on the remote host.

wez avatar Feb 23 '22 16:02 wez

> This gist: https://gist.github.com/martijnvermaat/8070533 has some suggestions on how to manage this sort of thing cooperatively between your shell and ssh configuration on the remote host.

I'm using this on tmux and it works fine. I basically run refresh whenever I start getting errors and it just works. Is there a way to relay environment variables to wezterm? Or at least to set an environment var at the startup of every pane?

pferreir avatar Jul 12 '22 09:07 pferreir

Anyway, this seems to work for me:

~/.zshrc

if [ -n "$SSH_CONNECTION" ]; then
  export SSH_AUTH_SOCK=$HOME/.ssh/ssh_auth_sock
fi

~/.ssh/rc

#!/bin/bash

# Repoint the stable symlink at the socket sshd created for this connection.
if test "$SSH_AUTH_SOCK" ; then
    ln -sf "$SSH_AUTH_SOCK" ~/.ssh/ssh_auth_sock
fi

pferreir avatar Jul 12 '22 09:07 pferreir

Just thinking out loud. Could we make the mux daemon create its own SSH_AUTH_SOCK and expose that to the processes? This path would be stable and wouldn't have to be updated as long as the daemon exists. This socket would strictly be a proxy. When an ssh -A connection connects to the daemon, the daemon will take the new SSH_AUTH_SOCK provided by ssh and use it as the endpoint for the proxy socket.

https://www.libssh2.org/examples/ssh2_agent_forwarding.html has an example of setting up the agent forwarding channel.

ismell avatar Feb 23 '23 19:02 ismell

+1. Even with a plain unix domain (no proxy_command), I need to rerun eval $(ssh-agent -s) && ssh-add ... every time I open a new pane

brandonchinn178 avatar Aug 10 '23 21:08 brandonchinn178

I don't think this helps with the multiplexer scenario, but I want to note that #5345 enables agent forwarding in wezterm's internal ssh implementation in main. I think that makes it more feasible to build an integrated solution for this, but I don't currently have time or plans to sit down and do this myself.

wez avatar May 08 '24 15:05 wez

As of 4af418fddd0ed2d9a8861007112006fc657ecbac, the mux server now automatically maintains SSH_AUTH_SOCK based on the most recently active multiplexer client. This works for sure with ssh domains, and will likely also work with unix domains if you're using the proxy_command hack shown in the thread above, although I have not personally tested that.
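One way to check from inside a pane whether forwarding is actually in effect is sketched below. `check_agent_forwarding` is a hypothetical helper name. It only runs the same checks used earlier in this thread (whether `SSH_AUTH_SOCK` is set, whether it points at a socket, and what ssh-add -L reports) and assumes nothing about how the mux server maintains the variable.

```shell
#!/usr/bin/env bash
# Sketch: report whether this shell can reach a forwarded ssh agent.
# check_agent_forwarding is a hypothetical helper, not a wezterm command.
check_agent_forwarding() {
  if [ -z "${SSH_AUTH_SOCK:-}" ]; then
    echo "SSH_AUTH_SOCK is not set"
    return 1
  fi
  if [ ! -S "$SSH_AUTH_SOCK" ]; then
    echo "SSH_AUTH_SOCK is not a socket: $SSH_AUTH_SOCK"
    return 1
  fi
  # Lists the public keys held by the agent; fails if it is unreachable.
  ssh-add -L
}
```

Running check_agent_forwarding in a freshly opened pane, and again after disconnecting and reconnecting, should show whether the socket path is being kept current.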

wez avatar May 09 '24 20:05 wez

This is wonderful, thanks a ton!!! I can confirm my session is correctly forwarded when doing wezterm connect!

carlesso avatar May 10 '24 18:05 carlesso