Win32-OpenSSH icon indicating copy to clipboard operation
Win32-OpenSSH copied to clipboard

Invoking `ssh` in a command substitution in a `ProxyCommand` hangs

Open phil-blain opened this issue 1 year ago • 6 comments
trafficstars

Prerequisites

  • [X] Write a descriptive title.
  • [X] Make sure you are able to repro it on the latest version
  • [X] Search the existing issues.

Steps to reproduce

I'm using a ProxyCommand that itself invokes SSH via command substitution in PowerShell, which can be simplified to the following configuration:

Host host
    User <user-on-host>
Host test
    User <user-on-host>
    ProxyCommand powershell.exe -noprofile -command "ssh -x -W $(ssh -x -n host echo host):%p host"

where host is a Linux host I have access to, i.e. I can do ssh host. I invoke the following command:

ssh test

This command hangs.

The same configuration works in WSL (OpenSSH_8.9p1 Ubuntu-3, OpenSSL 3.0.2 15 Mar 2022), and connects me to host. EDIT 2024-12-01 An equivalent configuration also works in WSL, which does not involve PowerShell:

Host host
    User <user-on-host>
Host test2
    User <user-on-host>
    ProxyCommand ssh -W $(ssh -x -n host echo host):%p host

Expected behavior

I get a shell on host.

Actual behavior

The command hangs. When I type ctrl-c, I get:

chan_shutdown_write: channel 0: close() failed for fd -1 [i3 o1]: Interrupted function call

Error details

PowerShell does not return an error.

Environment data

Name                           Value                                                                                                                                                                            
----                           -----                                                                                                                                                                            
PSVersion                      5.1.22621.4249                                                                                                                                                                   
PSEdition                      Desktop                                                                                                                                                                          
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                                                                                                          
BuildVersion                   10.0.22621.4249                                                                                                                                                                  
CLRVersion                     4.0.30319.42000                                                                                                                                                                  
WSManStackVersion              3.0                                                                                                                                                                              
PSRemotingProtocolVersion      2.3                                                                                                                                                                              
SerializationVersion           1.1.0.1

Version

OpenSSH_for_Windows_9.8p1 Win32-OpenSSH-GitHub, LibreSSL 3.9.2

Visuals

No response

phil-blain avatar Nov 22 '24 18:11 phil-blain

I'm trying to debug this in Visual Studio 2022 using the "Microsoft Child Process Debugging Power Tool 2022" extension.

I modified my ProxyCommand to the following:

Host test
    ProxyCommand powershell.exe -noprofile -command "C:\Users\me\Code\openssh-portable\bin\x64\Debug\ssh.exe -x -W $(C:\Users\me\Code\openssh-portable\bin\x64\Debug\ssh.exe -x -n -vvv host echo host):%p host"

so that all SSH instances use my debug build of ssh.exe. After powershell.exe is launched, I get a second ssh.exe process which is the inner one, inside the command substitution (ssh -x -n host echo host). When I debug this process, I see the echo host command being sent to the server:

debug1: Sending command: echo host
debug2: channel 0: request exec confirm 1
debug3: send packet: type 98
debug3: client_repledge: enter
debug1: pledge: fork
debug2: channel_input_open_confirmation: channel 0: callback done                                                       
debug2: channel 0: open confirm rwindow 0 rmax 32768                                                                    
debug3: send packet: type 80                                                                                            
debug2: channel 0: rcvd adjust 2097152                                                                                  
debug3: receive packet: type 99                                                                                         
debug2: channel_input_status_confirm: type 99 id 0                                                                      
debug2: exec request accepted on channel 0                                                                              
debug3: receive packet: type 82                                                                                         
debug3: receive packet: type 96                                                                                         
debug2: channel 0: rcvd eof                                                                                             
debug2: channel 0: output open -> drain                                                                                 
debug3: receive packet: type 98                                                                                         
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0                                                   
debug3: receive packet: type 98                                                                                         
debug1: client_input_channel_req: channel 0 rtype [email protected] reply 0                                               
debug2: channel 0: rcvd eow                                                                                             
debug2: chan_shutdown_read: channel 0: (i0 o1 sock -1 wfd 4 efd 6 [write])                                              
debug2: channel 0: input open -> closed                                                                                 
debug3: receive packet: type 97                                                                                         
debug2: channel 0: rcvd close                                                                                           
debug3: channel 0: will not send data after close                                                                       
debug3: send packet: type 80                                                                                            
debug3: channel 0: will not send data after close                                                                       
debug2: channel 0: obuf empty                                                                                           
debug2: chan_shutdown_write: channel 0: (i3 o1 sock -1 wfd 5 efd 6 [write]

and I the server responds (I'm running the server with -D -e -o LogLevel=DEBUG3 to follow what is going on).

After that, I continue execution and nothing happens. If I then pause execution in VS, the call stack of the inner SSH is this:

 	[External code]	
 	ssh.exe!wait_for_multiple_objects_enhanced(unsigned long nCount, void * const * lpHandles, unsigned long dwMilliseconds, int bAlertable) Ligne 96	C
 	ssh.exe!wait_for_any_event(void * * events, int num_events, unsigned long milli_seconds) Ligne 287	C
>	ssh.exe!fileio_close(w32_io * pio) Ligne 1102	C
 	ssh.exe!w32_close(int fd) Ligne 642	C
 	ssh.exe!channel_close_fd(ssh * ssh, Channel * c, int * fdp) Ligne 552	C
 	ssh.exe!chan_shutdown_write(ssh * ssh, Channel * c) Ligne 391	C
 	ssh.exe!chan_obuf_empty(ssh * ssh, Channel * c) Ligne 162	C
 	ssh.exe!channel_pre_open(ssh * ssh, Channel * c) Ligne 1246	C
 	ssh.exe!channel_handler(ssh * ssh, int table, timespec * timeout) Ligne 2608	C
 	ssh.exe!channel_prepare_poll(ssh * ssh, w32_pollfd * * pfdp, unsigned int * npfd_allocp, unsigned int * npfd_activep, unsigned int npfd_reserved, timespec * timeout) Ligne 2795	C
 	ssh.exe!client_wait_until_can_do_something(ssh * ssh, w32_pollfd * * pfdp, unsigned int * npfd_allocp, unsigned int * npfd_activep, int channel_did_enqueue, int * sigsetp, int * conn_in_readyp, int * conn_out_readyp) Ligne 704	C
 	ssh.exe!client_loop(ssh * ssh, int have_pty, int escape_char_arg, int ssh2_chan_id) Ligne 1596	C
 	ssh.exe!ssh_session2(ssh * ssh, const ssh_conn_info * cinfo) Ligne 2328	C
 	ssh.exe!main(int ac, char * * av) Ligne 1809	C
 	ssh.exe!wmain(int argc, wchar_t * * wargv) Ligne 62	C
 	[External code]	

and if I go to the frame highlighted with > above, I see this comment and code

	/*
	* we report to POSIX app that an async write has completed as soon its
	* copied to internal buffer. The app may subsequently try to close the
	* fd thinking everything is written. IF the Windows handle is closed
	* now, the pipe/file io write operation may terminate prematurely.
	* To compensate for the discrepency
	* wait here until async write has completed.
	* If you see any process waiting here indefinitely - its because no one
	* is draining from other end of the pipe/file. This is an unfortunate
	* consequence that should otherwise have very little impact on practical
	* scenarios.
	*/
	while (pio->write_details.pending)
		if (0 != wait_for_any_event(NULL, 0, INFINITE))
			return -1;

which was added in PowerShell/openssh-portable@4edff78b9 (Fixed issue in POSIX layer that could truncate write streams (PowerShell/openssh-portable#235), 2017-11-06) by @manojampalam.

This seems coherent with the chan_shutdown_write: error I get if I ctrl-c, since this function is a few frames above in the call stack...

phil-blain avatar Dec 01 '24 18:12 phil-blain

Another data point, I tried this alternative approach:

Host test2
    ProxyCommand powershell.exe -noprofile -command "ssh -x -n host echo host >destination; ssh -vvv -W $(cat destination):%p host"

and the behaviour is the same.

I get no output from -vvv in the second invocation, it seems PowerShell never reaches it.

phil-blain avatar Dec 01 '24 20:12 phil-blain

Please try building and using private binaries from this PR. (anmenaga:cleanup_hang_fix branch)

anmenaga avatar Dec 10 '24 21:12 anmenaga

Please try building and using private https://github.com/PowerShell/openssh-portable/pull/763. (anmenaga:cleanup_hang_fix branch)

Thank you for the suggestion ! I tried building from that branch, it did not change the behaviour.

I also tried with different values of environment variable OPENSSH_STDIO_MODE, following https://github.com/PowerShell/openssh-portable/pull/759. It did not change the behaviour either.

phil-blain avatar Dec 19 '24 17:12 phil-blain

Can you try applying the same timeout from https://github.com/PowerShell/openssh-portable/pull/763/files to where this hang is occurring in https://github.com/PowerShell/openssh-portable/blob/b89ee6aa372f03fd10ae13e5af23b44be4456c79/contrib/win32/win32compat/fileio.c#L1089-L1103? Would need to change INFINITE to 1000 to timeout after a second instead of waiting forever.

tgauth avatar Dec 19 '24 18:12 tgauth

Thank you for the suggestion. I tried this diff:

diff --git a/contrib/win32/win32compat/fileio.c b/contrib/win32/win32compat/fileio.c
index b7e4e6338..032fbd940 100644
--- a/contrib/win32/win32compat/fileio.c
+++ b/contrib/win32/win32compat/fileio.c
@@ -1099,7 +1099,7 @@ fileio_close(struct w32_io* pio)
 	* scenarios.
 	*/
 	while (pio->write_details.pending)
-		if (0 != wait_for_any_event(NULL, 0, INFINITE))
+		if (0 != wait_for_any_event(NULL, 0, 1000))
 			return -1;
 
 	CancelIo(WINHANDLE(pio));

but the behaviour is the same.

phil-blain avatar Jan 08 '25 21:01 phil-blain