unit icon indicating copy to clipboard operation
unit copied to clipboard

Socket: Ensured the control socket & pid file directories exist.

Open ac000 opened this issue 2 years ago • 32 comments

@alejandro-colomar reported an issue on GitHub whereby Unit would fail to start due to not being able to create the control socket (a Unix Domain Socket)

2022/08/05 20:12:22 [alert] 21613#21613 bind(6, unix:/opt/local/unit/var/run/unit/control.unit.sock.tmp) failed (2: No such file or directory)

This could happen if the control socket was set to a directory that doesn't exist. A common place to put the control socket would be under /run/unit, and while /run will exist, /run/unit may well not (/run is/should be cleared on each boot).

The pid file would also generally go under /run/unit, though this is created after the control socket, however it could go someplace else so we should also ensure its directory exists.

This commit ensures that the full directory paths to where the control socket and pid file will be located exists before trying to create them.

Closes: https://github.com/nginx/unit/issues/742 Reported-by: Alejandro Colomar [email protected] Reviewed-by: Alejandro Colomar [email protected] Tested-by: Alejandro Colomar [email protected] Signed-off-by: Andrew Clayton [email protected]

ac000 avatar Aug 15 '22 13:08 ac000

I've included a bunch of tags at the end this commit to demonstrate what they would look like, they can be removed before committing...

The tags serve several functions, including

  1. Signed-off-by is the main one and says that the author (and the person(s) who handle the patch on its way to the repotitory) have the right to submit the patch under the terms of the projects license.

  2. Give credit to contributors who review code, report bugs, test patches etc.

  3. Provide an audit trail of how a patch made it into the project.

ac000 avatar Aug 15 '22 13:08 ac000

In my feelings, in the function nxt_socket_bind(), there are a few places that are worth reworking.

#if NXT_HAVE_UNIX_DOMAIN
    if (sa->u.sockaddr.sa_family == AF_UNIX) {
        char  *ptr;
        char  *socket_dir;

        socket_dir = strdup(sa->u.sockaddr_un.sun_path);
        if (nxt_slow_path(socket_dir == NULL)) {
            return NXT_ERROR;
        }

        ptr = strrchr(socket_dir, '/');
        *ptr = '\0';

        nxt_fs_mkdir_all((const u_char *) socket_dir, 0777);

        nxt_free(socket_dir);
    }
#endif

  1. We usually use wrapper functions instead of C native functions if possible, such as strchr() to nxt_strchr().
  2. It's possible that calling nxt_fs_mkdir_all() failed.
  3. What about abstract unix socket?
  4. This will make nxt_socket_bind() dependent on nxt_fs_mkdir_all(). Or say that making socket dependent on file. It seems it's unreasonable. And users will be able to create directories by reconfiguration.

hongzhidao avatar Aug 15 '22 14:08 hongzhidao

In my feelings, in the function nxt_socket_bind(), there are a few places that are worth reworking.

#if NXT_HAVE_UNIX_DOMAIN
    if (sa->u.sockaddr.sa_family == AF_UNIX) {
        char  *ptr;
        char  *socket_dir;

        socket_dir = strdup(sa->u.sockaddr_un.sun_path);
        if (nxt_slow_path(socket_dir == NULL)) {
            return NXT_ERROR;
        }

        ptr = strrchr(socket_dir, '/');
        *ptr = '\0';

        nxt_fs_mkdir_all((const u_char *) socket_dir, 0777);

        nxt_free(socket_dir);
    }
#endif
1. We usually use wrapper functions instead of C native functions if possible, such as `strchr()` to `nxt_strchr()`.

We use the wrappers when we use nxt_str_t or u_char *, but in this case we're using "native" C strings (NUL-terminated char *), so it is simpler to use the native libc string functions.

2. It's possible that calling `nxt_fs_mkdir_all()` failed.

True, although if it fails, the next bind(2) is necessarily going to fail. We can safely omit the error check, I think.

But yeah, it might be nice adding it.

3. What about abstract unix socket?

I don't remember, but I think we don't support abstract unix sockets in the control socket. Do we?

4. This will make `nxt_socket_bind()` dependent on `nxt_fs_mkdir_all()`. Or say that making `socket` dependent on `file`. It seems it's unreasonable.  And users will be able to create directories by reconfiguration.

I don't fully understand this. Creating a socket in /foo/bar/sock necessarily requires that we can create the /foo/bar/ directory. Otherwise the socket creation will fail.

Yes, "users" will be able to create directories by reconfiguring, but that's why we strongly recommend that the control socket is only r/w by root. Only the root user should be able to reconfigure, and therefore there won't be any privilege issues.

alejandro-colomar avatar Aug 15 '22 14:08 alejandro-colomar

  1. This will make nxt_socket_bind() dependent on nxt_fs_mkdir_all(). Or say that making socket dependent on file. It seems it's unreasonable. And users will be able to create directories by reconfiguration. I don't fully understand this.

I mean every function should only do its job as what the function name is. The nxt_socket_bind() should only do socket bind, but not create directories.

hongzhidao avatar Aug 15 '22 14:08 hongzhidao

  1. It's possible that calling nxt_fs_mkdir_all() failed. True, although if it fails, the next bind(2) is necessarily going to fail. We can safely omit the error check, I think.

That nxt_socket_bind() will depend on the outside again. I prefer to make functions individual and reliable.

hongzhidao avatar Aug 15 '22 14:08 hongzhidao

  1. This will make nxt_socket_bind() dependent on nxt_fs_mkdir_all(). Or say that making socket dependent on file. It seems it's unreasonable. And users will be able to create directories by reconfiguration. I don't fully understand this.

I mean every function should only do its job as what the function name is. The nxt_socket_bind() should only do socket bind, but not create directories.

Then it might make sense to add a separate function that creates the dir and is called always before nxt_socket_bind(). But yes, they need to be called together, so I'd just make it depend on mkdir().

alejandro-colomar avatar Aug 15 '22 14:08 alejandro-colomar

  1. What about abstract unix socket? I don't remember, but I think we don't support abstract unix sockets in the control socket. Do we?

nxt_socket_bind() is a generic function, We can't assume that someone won't call it.

hongzhidao avatar Aug 15 '22 14:08 hongzhidao

What bind does is bind a socket file descriptor to a filesystem path. That filesystem path includes a file and its parent directory. I don't think creating a file is too different from creating a full path (dir + file). So I don't think it is really a different operation.

alejandro-colomar avatar Aug 15 '22 14:08 alejandro-colomar

Just to wonder if DevOps can do the thing of creating directories by themself?

hongzhidao avatar Aug 15 '22 14:08 hongzhidao

Just to wonder if DevOps can do the thing of creating directories by themself?

Do you mean manually create the directory?

It's not reasonable, because at reboot, the OS destroys everything under /run, so it would require that the directory is manually recreated at every reboot.

alejandro-colomar avatar Aug 15 '22 14:08 alejandro-colomar

What bind does is bind a socket file descriptor to a filesystem path. That filesystem path includes a file and its parent directory. I don't think creating a file is too different from creating a full path (dir + file). So I don't think it is really a different operation.

What if a system administrator only wants the socket to run in a specific directory?

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

What bind does is bind a socket file descriptor to a filesystem path. That filesystem path includes a file and its parent directory. I don't think creating a file is too different from creating a full path (dir + file). So I don't think it is really a different operation.

What if a system administrator only wants the socket to run in a specific directory?

Sockets should be created under some subdirectory of /run. The sysadmin should specify the specific directory that s/he wants in the unit config.

alejandro-colomar avatar Aug 15 '22 15:08 alejandro-colomar

If the sysadmin wants to limit the ability of unitd to create certain directories, it can limit the permissions of the user that will run it, so that it's not root. If unitd is run under root, it will have root privileges. So, normal filesystem permission situation. Nothing dangerous, as I see it.

alejandro-colomar avatar Aug 15 '22 15:08 alejandro-colomar

From a function design point of view, why is nxt_socket_bind() able to create any directories if possible? It should only do socket bind as its function name.

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

  1. What about abstract unix socket?

I don't remember, but I think we don't support abstract unix sockets in the control socket. Do we?

No, I'm pretty sure it doesn't support being an abstract socket, it uses a different code path to the listeners.

Actually, let me double check that...

ac000 avatar Aug 15 '22 15:08 ac000

Actually, let me double check that...

It's better to make sure if it's reasonable to create directories in nxt_socket_bind().

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

I found some information I didn't know before. https://github.com/nginx/unit/issues/742

alx@asus5775:~/src/nginx/unit$ sudo /opt/local/unit/sbin/unitd 
2022/08/05 20:12:22 [alert] 21613#21613 bind(6, unix:/opt/local/unit/var/run/unit/control.unit.sock.tmp) failed (2: No such file or directory)
unitd fails to create the socket (because it assumes the parent dir exists always), and exits.

unitd should create the parent directory previous to attempting to bind(2) the socket.
  1. unitd fails to create the socket (because it assumes the parent dir exists always), and exits.

I think that's a reasonable limit, unit is not responsible for managing related unix socket directory, it only works with UNIX socket file.

  1. unitd should create the parent directory previous to attempting to bind(2) the socket.

Just to think that if it's a reasonable requirement to help users create any socket directory internally?

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

I found some information I didn't know before. #742

alx@asus5775:~/src/nginx/unit$ sudo /opt/local/unit/sbin/unitd 
2022/08/05 20:12:22 [alert] 21613#21613 bind(6, unix:/opt/local/unit/var/run/unit/control.unit.sock.tmp) failed (2: No such file or directory)
unitd fails to create the socket (because it assumes the parent dir exists always), and exits.

unitd should create the parent directory previous to attempting to bind(2) the socket.
1. > unitd fails to create the socket (because it assumes the parent dir exists always), and exits.

I think that's a reasonable limit, unit is not responsible for managing related unix socket directory, it only works with UNIX socket file.

2. > unitd should create the parent directory previous to attempting to bind(2) the socket.

Just to think that if it's a reasonable requirement to help users create any socket directory internally?

We need to create the directories internally. Unix sockets are represented by files in the filesystem, and of course those files live in a directory. That directory will normally not exist, and it's impossible to create them permanently if done correctly (/run), so we can't ask users to create those directories themselves.

If we do it within nxt_socket_bind() or not, can be discussed, but we need to do it somewhere. I think it makes sense to do it within the bind function, because it's very related to bind (bind creates a file, so it isn't crazy to create also the parent dir). Also, we always need to run mkdir before binding a unix socket, so it makes sense to do it in one call, instead of writing more code.

alejandro-colomar avatar Aug 15 '22 15:08 alejandro-colomar

Actually, let me double check that...

So, it looks OK

$ strace -f -e mkdir,mkdirat,bind /opt/unit/sbin/unitd --no-daemon
2022/08/15 16:24:18 [warn] 12285#12285 Unit is running unprivileged, then it cannot use arbitrary user and group.
mkdir("/opt/unit/state/certs/", 0700)   = -1 EEXIST (File exists)
mkdir("/opt", 0777)                     = -1 EEXIST (File exists)
mkdir("/opt/unit", 0777)                = -1 EEXIST (File exists)
bind(6, {sa_family=AF_UNIX, sun_path="/opt/unit/control.unit.sock.tmp"}, 34) = 0
2022/08/15 16:24:18 [info] 12285#12285 unit 1.28.0 started
mkdir("/opt", 0777)                     = -1 EEXIST (File exists)
mkdir("/opt/unit", 0777)                = -1 EEXIST (File exists)
strace: Process 12286 attached
[pid 12286] +++ exited with 0 +++
strace: Process 12287 attached
strace: Process 12288 attached
[pid 12285] bind(10, {sa_family=AF_UNIX, sun_path=@"magic"}, 8) = 0

It's better to make sure if it's reasonable to create directories in nxt_socket_bind().

So it seems there are two questions

  1. Is it OK to have the bind(2) fail due to a missing directory for the control socket?

If the answer above is no, then

  1. Where is the appropriate place in the code to create the directory?

As for (1), '/run/unit' is a logical place for the control socket (and pid file) and as @alejandro-colomar pointed out, /run is/should be nuked each boot. These days, on Linux at least, /run should be mounted as a tmpfs filesystem and will be nuked automatically at shutdown/reboot.

Having unit create this directory at start up seems not unreasonable.

ac000 avatar Aug 15 '22 15:08 ac000

Let's separate it into two requirements.

  1. If it reasonable to limit the creation of socket directory? For example:
{
       "listeners": {
              "unix:/a/b/c/some.sock": {}
       }
}

If it has root privilege, the above configuration is valid or not? I prefer invalid. unix is not responsible for managing unix socket directory.

  1. If it impossible to meet your needs without creating a directory internally? Is there other way to do it?

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

Let's separate it into two requirements.

1. If it reasonable to limit the creation of socket directory? For example:
{
       "listeners": {
              "unix:/a/b/c/some.sock": {}
       }
}

We're not (yet) modifying that, right? @ac000 this change doesn't affect listeners, right?

I think this patch only changes the behavior for the control socket (because it's the one that blocked my patch, but I think we should also allow that in the long term).

If it has root privilege, the above configuration is valid or not? I prefer invalid. unix is not responsible for managing unix socket directory.

It should be valid for root. The Filesystem Hierarchy Standard requires that socket files should be created under an app-specific directory of /run:

https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

And /run is nuked by the OS at reboot, so it's impossible (except by using rc init scripts, systemd, or a docker entrypoint, depending on your system) to create the directory outside of unitd. I think it's our responsibility to create the directory.

2. If it impossible to meet your needs without creating a directory internally? Is there other way to do it?

The other ways are writing that in the same init scripts that run unitd.

alejandro-colomar avatar Aug 15 '22 15:08 alejandro-colomar

We're not (yet) modifying that, right? @ac000 this change doesn't affect listeners, right?

For devs, nxt_socket_bind() is a generic function, we can't predict who uses it or not.

hongzhidao avatar Aug 15 '22 15:08 hongzhidao

We're not (yet) modifying that, right? @ac000 this change doesn't affect listeners, right?

For devs, nxt_socket_bind() is a generic function, we can't predict who uses it or not.

Whoever needs to run that function should also want to run mkdir. We're not creating a problem to users of the function. We're just making life easier to them.

alejandro-colomar avatar Aug 15 '22 15:08 alejandro-colomar

Let's separate it into two requirements.

1. If it reasonable to limit the creation of socket directory? For example:
{
       "listeners": {
              "unix:/a/b/c/some.sock": {}
       }
}

We're not (yet) modifying that, right? @ac000 this change doesn't affect listeners, right?

Correct.

I think this patch only changes the behavior for the control socket (because it's the one that blocked my patch, but I think we should also allow that in the long term).

Yes, this change only affects the control socket.

ac000 avatar Aug 15 '22 15:08 ac000

OK, so after some code splunking, I have found the following...

In unit there are currently two calls to bind(2)

Controller socket bind(2) path

main()
  nxt_runtime_create()
    nxt_runtime_conf_init()
      nxt_runtime_controller_socket()
        nxt_listen_socket_create()
          nxt_socket_bind()			<== src/nxt_socket.c
            bind(2)

Listen socket bind(2) path

main()
  nxt_runtime_create()
    nxt_runtime_initial_start()
      nxt_main_process_start()
        nxt_main_process_port_create()
          nxt_port_enable()
            nxt_main_port_socket_handler()
              nxt_main_listening_socket()	<== src/nxt_main_porcess.c
                bind(2)

There are two calls to nxt_socket_bind(), the above one and

[
#define nxt_conn_connect(engine, c)                                           \ 
    nxt_work_queue_add(&engine->socket_work_queue, nxt_conn_sys_socket,       \ 
                       c->socket.task, c, c->socket.data) 
nxt_conn_sys_socket()
]
  nxt_conn_socket()
    nxt_socket_bind()

$ grep -rn "nxt_conn_connect(" src/*.c
src/nxt_conn_proxy.c:117:        nxt_conn_connect(task->thread->engine, peer);
src/nxt_conn_proxy.c:224:    nxt_conn_connect(task->thread->engine, p->peer);
src/nxt_conn_proxy.c:846:    nxt_conn_connect(task->thread->engine, peer);
src/nxt_h1proto.c:2192:    nxt_conn_connect(task->thread->engine, c);

So it looks like nxt_socket_bind() is also used for connecting to proxies and HTTP CONNECTs. I'm guessing at least some of them could be Unix domain sockets...

What I think I'll do is factor out the directory making code into a helper function, make the pid file code use it and also call it in nxt_runtime_controller_socket() for the control socket.

ac000 avatar Aug 15 '22 19:08 ac000

This version factors out the directory making code into a new nxt_fs_mkdir_dirname() function in src/nxt_fs.c

I've re-introduced the check for if we find a '/' in the path as this function may be used more generally and it will return NXT_OK in such cases, assuming that it's the current working directory that's being used which will almost certainly exist.

@alejandro-colomar I've removed your Tested-by and S-o-b tags as this has been somewhat reworked...

ac000 avatar Aug 16 '22 00:08 ac000

Just to wonder if these automatically created directories need to be cleaned up after unitd quits?

hongzhidao avatar Aug 16 '22 00:08 hongzhidao

Just to wonder if these automatically created directories need to be cleaned up after unitd quits?

No, they need not be cleaned up. The OS nukes /run at each boot, which is the reason we need to create them. Also, trailing directories are usually not a problem (unless they have the same name as a file that is trying to be created, but that would be a problem of the person that organized that, not ours).

alejandro-colomar avatar Aug 16 '22 09:08 alejandro-colomar

I'm having some doubts: https://linux.codidact.com/posts/286875

alejandro-colomar avatar Aug 16 '22 10:08 alejandro-colomar

Regarding the trailing directories: it's a minor issue, especially in the standard path, which is /run, since it's nuked at reboot.

One might say that in non-standard paths that may be a problem, but usually, having trailing directories is not a big issue.

Still, it may be an issue for some. But then, those "some" shouldn't be asking unit to use such a path (e.g., /bin/unit/control.unit.sock).

But, since we don't cleanup (and I argue that we shouldn't either) socket files, it would be nonesense to cleanup dirs. --control=/bin/control.unit.sock would be as brain-damaged as this, or even more, and that is already possible with current unit. We shouldn't think about what brain-damaged users can do with our program, as long as it's not introducing a serious vulnerability.

Other software, such as avahi, also create the runtime directory for the pid file and a socket within the program, with mkdir(2) calls, as this patch is doing. So it looks good to me.

Reviewed-by: Alejandro Colomar <[email protected]>

I'll add my tested-by tag soon when I test it.

alejandro-colomar avatar Aug 16 '22 14:08 alejandro-colomar