Win32-OpenSSH
Win32-OpenSSH copied to clipboard
Invoking `ssh` in a command substitution in a `ProxyCommand` hangs
Prerequisites
- [X] Write a descriptive title.
- [X] Make sure you are able to repro it on the latest version
- [X] Search the existing issues.
Steps to reproduce
I'm using a ProxyCommand that itself invokes SSH via command substitution in PowerShell, which can be simplified to the following configuration:
Host host
User <user-on-host>
Host test
User <user-on-host>
ProxyCommand powershell.exe -noprofile -command "ssh -x -W $(ssh -x -n host echo host):%p host"
where host is a Linux host I have access to, i.e. I can do ssh host. I invoke the following command:
ssh test
This command hangs.
The same configuration works in WSL (OpenSSH_8.9p1 Ubuntu-3, OpenSSL 3.0.2 15 Mar 2022), and connects me to host. EDIT 2024-12-01 An equivalent configuration also works in WSL, which does not involve PowerShell:
Host host
User <user-on-host>
Host test2
User <user-on-host>
ProxyCommand ssh -W $(ssh -x -n host echo host):%p host
Expected behavior
I get a shell on host.
Actual behavior
The command hangs. When I type ctrl-c, I get:
chan_shutdown_write: channel 0: close() failed for fd -1 [i3 o1]: Interrupted function call
Error details
PowerShell does not return an error.
Environment data
Name Value
---- -----
PSVersion 5.1.22621.4249
PSEdition Desktop
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0...}
BuildVersion 10.0.22621.4249
CLRVersion 4.0.30319.42000
WSManStackVersion 3.0
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
Version
OpenSSH_for_Windows_9.8p1 Win32-OpenSSH-GitHub, LibreSSL 3.9.2
Visuals
No response
I'm trying to debug this in Visual Studio 2022 using the "Microsoft Child Process Debugging Power Tool 2022" extension.
I modified my ProxyCommand to the following:
Host test
ProxyCommand powershell.exe -noprofile -command "C:\Users\me\Code\openssh-portable\bin\x64\Debug\ssh.exe -x -W $(C:\Users\me\Code\openssh-portable\bin\x64\Debug\ssh.exe -x -n -vvv host echo host):%p host"
so that all SSH instances use my debug build of ssh.exe. After powershell.exe is launched, I get a second ssh.exe process which is the inner one, inside the command substitution (ssh -x -n host echo host). When I debug this process, I see the echo host command being sent to the server:
debug1: Sending command: echo host
debug2: channel 0: request exec confirm 1
debug3: send packet: type 98
debug3: client_repledge: enter
debug1: pledge: fork
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug3: send packet: type 80
debug2: channel 0: rcvd adjust 2097152
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
debug3: receive packet: type 82
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype [email protected] reply 0
debug2: channel 0: rcvd eow
debug2: chan_shutdown_read: channel 0: (i0 o1 sock -1 wfd 4 efd 6 [write])
debug2: channel 0: input open -> closed
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug3: channel 0: will not send data after close
debug3: send packet: type 80
debug3: channel 0: will not send data after close
debug2: channel 0: obuf empty
debug2: chan_shutdown_write: channel 0: (i3 o1 sock -1 wfd 5 efd 6 [write]
and I the server responds (I'm running the server with -D -e -o LogLevel=DEBUG3 to follow what is going on).
After that, I continue execution and nothing happens. If I then pause execution in VS, the call stack of the inner SSH is this:
[External code]
ssh.exe!wait_for_multiple_objects_enhanced(unsigned long nCount, void * const * lpHandles, unsigned long dwMilliseconds, int bAlertable) Ligne 96 C
ssh.exe!wait_for_any_event(void * * events, int num_events, unsigned long milli_seconds) Ligne 287 C
> ssh.exe!fileio_close(w32_io * pio) Ligne 1102 C
ssh.exe!w32_close(int fd) Ligne 642 C
ssh.exe!channel_close_fd(ssh * ssh, Channel * c, int * fdp) Ligne 552 C
ssh.exe!chan_shutdown_write(ssh * ssh, Channel * c) Ligne 391 C
ssh.exe!chan_obuf_empty(ssh * ssh, Channel * c) Ligne 162 C
ssh.exe!channel_pre_open(ssh * ssh, Channel * c) Ligne 1246 C
ssh.exe!channel_handler(ssh * ssh, int table, timespec * timeout) Ligne 2608 C
ssh.exe!channel_prepare_poll(ssh * ssh, w32_pollfd * * pfdp, unsigned int * npfd_allocp, unsigned int * npfd_activep, unsigned int npfd_reserved, timespec * timeout) Ligne 2795 C
ssh.exe!client_wait_until_can_do_something(ssh * ssh, w32_pollfd * * pfdp, unsigned int * npfd_allocp, unsigned int * npfd_activep, int channel_did_enqueue, int * sigsetp, int * conn_in_readyp, int * conn_out_readyp) Ligne 704 C
ssh.exe!client_loop(ssh * ssh, int have_pty, int escape_char_arg, int ssh2_chan_id) Ligne 1596 C
ssh.exe!ssh_session2(ssh * ssh, const ssh_conn_info * cinfo) Ligne 2328 C
ssh.exe!main(int ac, char * * av) Ligne 1809 C
ssh.exe!wmain(int argc, wchar_t * * wargv) Ligne 62 C
[External code]
and if I go to the frame highlighted with > above, I see this comment and code
/*
* we report to POSIX app that an async write has completed as soon its
* copied to internal buffer. The app may subsequently try to close the
* fd thinking everything is written. IF the Windows handle is closed
* now, the pipe/file io write operation may terminate prematurely.
* To compensate for the discrepency
* wait here until async write has completed.
* If you see any process waiting here indefinitely - its because no one
* is draining from other end of the pipe/file. This is an unfortunate
* consequence that should otherwise have very little impact on practical
* scenarios.
*/
while (pio->write_details.pending)
if (0 != wait_for_any_event(NULL, 0, INFINITE))
return -1;
which was added in PowerShell/openssh-portable@4edff78b9 (Fixed issue in POSIX layer that could truncate write streams (PowerShell/openssh-portable#235), 2017-11-06) by @manojampalam.
This seems coherent with the chan_shutdown_write: error I get if I ctrl-c, since this function is a few frames above in the call stack...
Another data point, I tried this alternative approach:
Host test2
ProxyCommand powershell.exe -noprofile -command "ssh -x -n host echo host >destination; ssh -vvv -W $(cat destination):%p host"
and the behaviour is the same.
I get no output from -vvv in the second invocation, it seems PowerShell never reaches it.
Please try building and using private binaries from this PR. (anmenaga:cleanup_hang_fix branch)
Please try building and using private https://github.com/PowerShell/openssh-portable/pull/763. (anmenaga:cleanup_hang_fix branch)
Thank you for the suggestion ! I tried building from that branch, it did not change the behaviour.
I also tried with different values of environment variable OPENSSH_STDIO_MODE, following https://github.com/PowerShell/openssh-portable/pull/759. It did not change the behaviour either.
Can you try applying the same timeout from https://github.com/PowerShell/openssh-portable/pull/763/files to where this hang is occurring in https://github.com/PowerShell/openssh-portable/blob/b89ee6aa372f03fd10ae13e5af23b44be4456c79/contrib/win32/win32compat/fileio.c#L1089-L1103?
Would need to change INFINITE to 1000 to timeout after a second instead of waiting forever.
Thank you for the suggestion. I tried this diff:
diff --git a/contrib/win32/win32compat/fileio.c b/contrib/win32/win32compat/fileio.c
index b7e4e6338..032fbd940 100644
--- a/contrib/win32/win32compat/fileio.c
+++ b/contrib/win32/win32compat/fileio.c
@@ -1099,7 +1099,7 @@ fileio_close(struct w32_io* pio)
* scenarios.
*/
while (pio->write_details.pending)
- if (0 != wait_for_any_event(NULL, 0, INFINITE))
+ if (0 != wait_for_any_event(NULL, 0, 1000))
return -1;
CancelIo(WINHANDLE(pio));
but the behaviour is the same.