openssh-portable icon indicating copy to clipboard operation
openssh-portable copied to clipboard

Add support for AF_UNIX

Open bilby91 opened this issue 1 year ago • 29 comments

PR Summary

This pull requests adds support for AF_UNIX sockets.

The implementation modifies contrib/win32/win32compat/socketio.c socketio_acceptEx implementation so that it can handle sa_family == AF_UNIX. contrib/win32/win32compat/w32fd.c has also been modified in order to remove the path that uses fileio_afunix_socket.

The unix socket name is conditionally using C:\\tmp\\ssh-XXXXXXXXXX instead of /tmp/ssh-XXXXXXXXXX since it doesn't work correctly when using ssh in the windows host. I'm not sure if /tmp is the best directory to host the sock in Windows but I left it that way since unix users will expect the agent socks to live there. The caveat here is that you need to create the /tmp folder in Windows in order to properly work.

One known gap that the implementation is missing is conditionally using AF_UNIX. I know that a subset of Windows version support it but I'm not sure what is the best way to handle this. If I can get any guidance in terms of how to approach it I can test it and modify it.

PR Context

The motivation around working on this patch was to get support for SSH Agent Forwarding. With this patch applied, ssh-add -l will list the keys from the local system. I've also tested using git to clone repositories and it's successfully able to use the local system private keys through the agent.

I've also tested that WSL can successfully use the SSH Agent sock as long as the SSH_AUTH_SOCK is defined in the bash session.

This pull request could potentially solve the following issues:

https://github.com/PowerShell/Win32-OpenSSH/issues/1461 https://github.com/PowerShell/Win32-OpenSSH/issues/1761 https://github.com/PowerShell/Win32-OpenSSH/issues/1462

bilby91 avatar Apr 12 '23 17:04 bilby91

@tgauth Would it be fine if we guard the different code paths with #ifdef HAVE_AFUNIX_H ?

bilby91 avatar Apr 13 '23 14:04 bilby91

@tgauth Would it be fine if we guard the different code paths with #ifdef HAVE_AFUNIX_H ?

I don't see why not. If you haven't seen it yet, can add it to: https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/openssh/config.h.vs

Thanks for working on this!

tgauth avatar Apr 13 '23 14:04 tgauth

@tgauth Thanks for the quick feedback! I added the HAVE_AFUNIX_H guard. Visual Studio doesn't seem to be picking up the #define statement from config.h.vs, is there anything else I need to run in order to make the change effective when compiling ? Want to make sure stuff is working properly with my tests.

bilby91 avatar Apr 13 '23 14:04 bilby91

@tgauth Thanks for the quick feedback! I added the HAVE_AFUNIX_H guard. Visual Studio doesn't seem to be picking up the #define statement from config.h.vs, is there anything else I need to run in order to make the change effective when compiling ? Want to make sure stuff is working properly with my tests.

There's a config.h file in the root folder that gets created from config.h.vs during compilation of the config.vcxproj - can you confirm that's picking up the #define statement?

tgauth avatar Apr 13 '23 15:04 tgauth

@tgauth Yes, I can confirm that config.h in the root got updated.

/* Use PIPES instead of a socketpair() */
#define USE_PIPES 1

/* define 1 if afunix.h is available */
#define HAVE_AFUNIX_H 1

Still, it doesn't seem to be compiling with HAVE_AFUNIX_H 1.

bilby91 avatar Apr 13 '23 15:04 bilby91

@tgauth Yes, I can confirm that config.h in the root got updated.

/* Use PIPES instead of a socketpair() */
#define USE_PIPES 1

/* define 1 if afunix.h is available */
#define HAVE_AFUNIX_H 1

Still, it doesn't seem to be compiling with HAVE_AFUNIX_H 1.

oh, my bad - seems like config.h needs to be explicitly included in posix_compat where HAVE_AFUNIX_H is used. Try putting #include "..\..\..\config.h" in both socketio.c and w32fd.c, or try defining it in w32fd.h since that's imported by both?

tgauth avatar Apr 13 '23 16:04 tgauth

@tgauth That did the trick, thanks! I added the include statement in both files because doing so in w32fd.h had issues with signal.c.

I'm testing now a different approach for temp folder where the unix socket will live. I'm changing the implementation so that it uses fileapi.h GetTempPath to replace the /tmp. Do you think this is a better location ?

bilby91 avatar Apr 13 '23 17:04 bilby91

I'm testing now a different approach for temp folder where the unix socket will live. I'm changing the implementation so that it uses fileapi.h GetTempPath to replace the /tmp. Do you think this is a better location ?

yep, that makes sense to me!

tgauth avatar Apr 13 '23 17:04 tgauth

@tgauth I think I have addressed the different issues now. Take a look at the code that generates the temporal directory.

Last commit was pushed using this feature 😎

bilby91 avatar Apr 13 '23 20:04 bilby91

@tgauth Was looking at the failing tests. They generally timeout ? Some tests seem to have failed but they look unrelated (at first glance). I wonder if having #define HAVE_AFUNIX_H 1 by default is causing issues in the ADO environment.

Any insight ?

bilby91 avatar Apr 13 '23 22:04 bilby91

I think I know what is going on. Will try to fix it over the weekend.

bilby91 avatar Apr 14 '23 12:04 bilby91

@tgauth Tests are passing now :)

I had to change a few other calls to socket() due to the ssh-agent's named pipes implementation. I don't love how it's done but the path for the named pipe is fixed so comparing should be safe to decide (based on my understanding and local testing).

I think that a subsequent change could replace the named pipes usage in agent.c and use a native AF_UNIX socket to simplify the implementation.

I'm moved AGENT_PIPE_ID to config.h because I was planning to use it with strcmp but later realized that it's a wchar_t instead of a char * so I would need to convert it before comparing. I can do the transformation or compare with the hardcoded version. What do you think ?

Anything else I might be missing ?

Thanks so much looking at the changes!

bilby91 avatar Apr 15 '23 00:04 bilby91

I know that a subset of Windows version support it but I'm not sure what is the best way to handle this. If I can get any guidance in terms of how to approach it I can test it and modify it.

Based on https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/ support started with Windows 10. We have a similar check already defined that I think can also be utilized here: https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/misc_internal.h#L15

Anything else I might be missing ?

Ideally, would like to have some test coverage for agent forwarding! There are some upstream agent tests that the CI currently skips - https://github.com/PowerShell/openssh-portable/blob/latestw_all/regress/agent.sh, https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/openssh/bash_tests_iterator.ps1#L189. If the tests applicable to agent forwarding could be modified as necessary for Windows and not skipped, that would be great! A single bash test can be run with the following command: .\bash_tests_iterator.ps1 -OpenSSHBinPath "C:\openssh-portable\bin\x64\Release\" -BashTestsPath "c:\openssh-portable\regress" -shellpath "c:\cygwin64\bin\sh.exe" -TestFilePath "C:\openssh-portable\regress\agent.sh"

tgauth avatar Apr 17 '23 20:04 tgauth

I'm moved AGENT_PIPE_ID to config.h because I was planning to use it with strcmp but later realized that it's a wchar_t instead of a char * so I would need to convert it before comparing. I can do the transformation or compare with the hardcoded version. What do you think ?

For maintainability, I'm in favor of doing the transformation so that AGENT_PIPE_ID is the singular definition of the pipe name, as opposed to hardcoding it in a second place.

tgauth avatar Apr 17 '23 21:04 tgauth

@tgauth Thanks for the feedback!

I'll take a look at the tests and doing the transformation for AGENT_PIPE_ID!

In regards to using IsWindows8OrGreater/IsWin7OrLess, you are thinking of replacing the HAVE_AFUNIX_H with calls to those functions at runtime ? Not sure how Microsoft build process works but HAVE_AFUNIX_H could be defined only for Win7 >.

Just want to make sure I fully understand how we want to protect the paths.

bilby91 avatar Apr 17 '23 21:04 bilby91

@tgauth I was able to include a fraction of the agent.sh tests! Had to fix a bunch of stuff in the test-exec.sh, which was probably preventing them from running originally.

I had to continue excluding some of the tests for two reasons:

  1. Some tests try to use different agents but Windows only supports one of them. I think this limitation can go away if we implement the agent with AF_UNIX and follow the same strategy as OpenSSH-portable implementation.
  2. I wasn't able to get certificate-based key authentication to work. I saw some other tests skipped as well so I'm assuming this is currently not working.

Having said that, with these changes, we have a little bit more coverage while keeping all the other tests passing.

I also made the change to compare with AGENT_PIPE_ID. I think the last piece remaining would be to decide what to do with HAVE_AFUNIX_H and IsWindows8OrGreater. I think we could keep it as it is with HAVE_AFUNIX_H and decide at build time if the feature needs to be included or not, or, I can change the places where I used HAVE_AFUNIX_H and use IsWindows8OrGreater instead. Let me know what you think!

bilby91 avatar Apr 19 '23 13:04 bilby91

Ping @tgauth

bilby91 avatar Apr 21 '23 12:04 bilby91

@bilby91, thank you for working on this!

The changes look good from an initial review, but we need some additional time to do a comprehensive review, since this would be a new feature for Win32-OpenSSH.

For example, upstream has made some agent restrictions that have not yet been implemented for Windows that would be required to achieve parity.

tgauth avatar Apr 24 '23 18:04 tgauth

@tgauth Thanks for the feedback!

I totally get that this kind of change needs an important level of review before shipping to Windows.

Is there anything on my end that you think I can help with in the meantime?

bilby91 avatar Apr 24 '23 18:04 bilby91

Is there anything on my end that you think I can help with in the meantime?

Not that I can think of right now, but will keep you posted - thanks for your patience!

tgauth avatar May 02 '23 17:05 tgauth

@tgauth was wondering if you had any updates about this topic!

Thanks!

bilby91 avatar Sep 30 '23 01:09 bilby91

This is delayed due to focusing on more simple PRs into the 9.4 release as well as an in depth security review of the behavior change. Because this is a new feature, not a bug fix, it requires further scrutiny.

maertendMSFT avatar Oct 02 '23 16:10 maertendMSFT

@maertendMSFT I understand. Thanks for the feedback!

bilby91 avatar Oct 02 '23 18:10 bilby91

@bilby91 Do you know if this would also work for mux and ControlPath or will it require more work to get it working?

arixmkii avatar Dec 05 '23 21:12 arixmkii

@arixmkii To be honest I don't know.

Any chance you have local development environment where you could test it ? If not I might be able to help over the weekend.

bilby91 avatar Dec 05 '23 22:12 bilby91

@bilby91 I rebased your changes to 9.4.0.0 and built locally. Unfortunately it didn't work, but the error is suspicious, I don't have logs at hand, but it was about socket naming. Might be an error at command parsing level. I will debug it at the end of this week to figure out more details.

arixmkii avatar Dec 06 '23 11:12 arixmkii

Answering my question. No Mux will not work as everything mux is now covered by no_ops stubs. I will try to investigate the possible subset of Mux, which could be achieved with AF_UNIX additions from this PR.

arixmkii avatar Jan 09 '24 20:01 arixmkii

@bilby91, I just noticed your PR. I am just a bit confused as it looks like it might contradict parts of what I have researched over the weekend. (See also PowerShell/Win32-OpenSSH#1024 (comment).)

I also looked over your PR and made a few comments. However, I was missing some context and could therefore not really understand all of it. (Also note that I am not affiliated with Microsoft.)

What I don't understand at your PR is what has agent forwarding to do with whether SSH on Windows uses unix domain sockets instead of named pipes? This seems like two unrelated topics for me. I assume the changes in session.c apply to sshd? Is it possible that you could also have used named pipes here and then skipped all the other changes regarding unix domain sockets?

I've also tested that WSL can successfully use the SSH Agent sock as long as the SSH_AUTH_SOCK is defined in the bash session.

Are you talking about WSL1? Because I thought that WSL2 does not support interoperability with unix domain sockets (yet).

@JojOatXGME Thanks for taking a look at the code!

It has been almost an year since I worked on this change but I'm pretty positive I tried using named pipes instead of the AF_UNIX sockets and it was not working. Could have been lack of understanding on my side so it's highly likely than an implementation based on them is possible.

Regarding WSL1 vs WSL2, I think I was using WSL 2 in the machine I used for testing. Unfortunately I don't have that machine any more but I can look into double checking that in a different one potentially.

Are you able to test yourself by any chance ?

bilby91 avatar Feb 25 '24 20:02 bilby91

Are you able to test yourself by any chance ?

I have to see. I haven't worked on the project before. I actually did not work on any C-project for quite some time. I would have to set up some dev environment to build the project. Anyway, I will try to play with it within the next two weeks.

PS: Assuming we could use named pipes in session.c for the agent forwarding, we might be able to split this PR into three smaller changes: 1. agent forwarding, 2. client side support for native unix domain socket, 3. option to use native unix domain sockets for ssh-agent and sshd (agent forwarding). Maybe splitting the topic would simplify the security review? The biggest security risk or uncertainty comes from 3. in my opinion. If I find time to play with it, this is what I will properly try to check.

JojOatXGME avatar Feb 26 '24 00:02 JojOatXGME