Added (fixed) support for abstract Unix sockets.
Unix domain sockets are normally backed by files in the filesystem. This has historically been problematic when closing and opening again such sockets, since SO_REUSEADDR is ignored for Unix sockets (POSIX left the behavior of SO_REUSEADDR as implementation-defined, and most --if not all-- implementations decided to just ignore this flag).
Many solutions are available for this problem, but all of them have important caveats:
-
unlink(2) the file when it's not needed anymore.
This is not easy, because the process that controls the fd may not be the same process that created the file, and may not have file permissions to remove it.
Further solutions can be applied to that caveat:
-
unlink(2) the file right after creation.
This will remove the pathname from the filesystem without closing the socket (it will continue alive until the last fd is closed). This is not useful for us, since we need the pathname of the socket as its interface.
-
chown(2) or chmod(2) the directory that contains the socket.
For removing a file from the filesystem, a process needs write permissions in the containing directory. We could put sockets in dummy directories that can be chown(2)ed to nobody. This could be dangerous, though, as we don't control the socket names. It is our users who configure the socket name in their configuration, and so it's easy that they don't understand the many implications of not chosing an appropriate socket pathname. A user could unknowingly put the socket in a directory that is not supposed to be owned by user nobody, and if we blindly chown(2) or chmod(2) the directory, we could be creating a big security hole.
-
Ask the main process to remove the socket.
This would require a very complex communication mechanism with the main process, which is not impossible, but let's avoid it if there are simpler solutions.
-
Give the child process the CAP_DAC_OVERRIDE capability.
That is one of the most powerful capabilities. A process with that capability can be considered root for most practical aspects. Even if the capability is disabled for most of the lifetime of the process, there's a slight chance that a malicious actor could activate it and then easily do serious damage to the system.
-
-
unlink(2) the file right before calling bind(2).
This is dangerous because another process (for example, another running instance of unitd(8)), could be using the socket, and removing the pathname from the filesystem would be problematic. To do this correctly, a lot of checks should be added before the actual unlink(2), which is bug-prone, and difficult to do correctly, and atomically.
-
Use abstract-namespace Unix domain sockets.
This is the simplest solution, as it only requires accepting a slightly different syntax (basically a @ prefix) for the socket name, to transform it into a string starting with a null byte ('\0') that the kernel can understand. The patch is minimal.
Since abstract sockets live in an abstract namespace, they don't create files in the filesystem, so there's no need to remove them later. The kernel removes the name when the last fd to it has been closed.
One caveat is that only Linux currently supports this kind of Unix sockets. Of course, a solution to that could be to ask other kernels to implement such a feature.
Another caveat is that filesystem permissions can't be used to control access to the socket file (since, of course, there's no file). Anyone knowing the socket name can access to it. The only method to control access to it is by using network_namespaces(7). Since in unitd(8) we're using 0666 file sockets, abstract sockets should be more insecure than that (anyone can already read/write to the listener sockets).
-
Ask the kernel to implement an simpler way to unlink(2) socket files when they are not needed anymore. I've suggested that to the [email protected] mailing list, in: <lore.kernel.org/linux-fsdevel/[email protected]/T>
In this commit, I decided to go for the easiest/simplest solution.
This closes #669 issue on GitHub.
One detail:
Only the first @ is transformed into a \0. If the user specifies further @s, they will be treated literally.
If the user specifies \0s in the middle of the string, it will probably screw the JSON parser, so I didn't even care to add support for it, nor test it. We just don't support such broken names.
What I've already tested is configuring the following:
{
"listeners": {
"unix:@magic": {
"pass": "routes",
"client_ip": {
"header": "X-Forwarded-For",
"source": "unix"
}
}
},
"routes": [{
"action": {
"share": "/home/alx/srv/www/$uri"
}
}]
}
And then:
$ sudo curl -X PUT --data-binary @- --unix-socket /usr/local/control.unit.sock localhost/config/ <sock_abstract_static.json
{
"success": "Reconfiguration done."
}
And some sanity check:
$ ss -l | grep magic
u_str LISTEN 0 511 @magic@ 54141 * 0
And also, I tested a simple C program to connect to the socket and write to it some random stuff, and it (kind of) worked: the socket name has a trailing \0, so I had to connect to \0magic\0. I'm working on fixing that.
This is yet another even simpler alternative to #670 and #735 .
I'm trying to test this with curl, for which I need a socket name that doesn't end in \0. For that, I'm trying to do sa->socklen--;, but for some reason, the code is completely ignoring me, and I need someone else look into my code for a moment. I'm blind.
So, I have the following debug changes, and even though I'm setting the length to 3 (and sa_family_t is already 2 bytes, so sun_path should be just one \0.
Still, I see 9 (2 + 7) bytes in the log: ALX: 9; <^@-m-a-g-i-c-^@>.
$ git diff
diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index 0527d08f..4e1df140 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1147,10 +1147,21 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
#if (NXT_HAVE_UNIX_DOMAIN)
if (sa->u.sockaddr.sa_family == AF_UNIX) {
if (sa->u.sockaddr_un.sun_path[0] == '@') {
+ sa->socklen=3;
sa->u.sockaddr_un.sun_path[0] = '\0';
}
}
#endif
+fprintf(stderr, "ALX: %i; <%c-%c-%c-%c-%c-%c-%c>\n", (int) sa->socklen,
+sa->u.sockaddr_un.sun_path[0],
+sa->u.sockaddr_un.sun_path[1],
+sa->u.sockaddr_un.sun_path[2],
+sa->u.sockaddr_un.sun_path[3],
+sa->u.sockaddr_un.sun_path[4],
+sa->u.sockaddr_un.sun_path[5],
+sa->u.sockaddr_un.sun_path[6]);
ret = bind(s, &sa->u.sockaddr, sa->socklen);
And ss(8) still sees also @magic@.
Just take a glance.
- In
nxt_sockaddr_unix_parse(), there is already handling with abstract sockets.
#if (NXT_LINUX)
/*
* Linux unix(7):
*
* abstract: an abstract socket address is distinguished by the fact
* that sun_path[0] is a null byte ('\0'). The socket's address in
* this namespace is given by the additional bytes in sun_path that
* are covered by the specified length of the address structure.
* (Null bytes in the name have no special significance.)
*/
if (path[0] == '@') {
path[0] = '\0';
socklen--;
}
#endif
- Alternative check.
if (sa->u.sockaddr.sa_family == AF_UNIX) {
&& sa->u.sockaddr_un.sun_path[0] == '\0')
{
nxt_sockaddr_unix_parse
Yeah, it seems it wasn't entering my conditional. Heh!
Well, then, we have still some issue. There's a trailing \0, even though the code also did socklen--; as I was trying to do. I'll check why we still have that trailing byte. curl(1) doesn't support that.
Just take a glance.
1. In `nxt_sockaddr_unix_parse()`, there is already handling with abstract sockets.#if (NXT_LINUX) /* * Linux unix(7): * * abstract: an abstract socket address is distinguished by the fact * that sun_path[0] is a null byte ('\0'). The socket's address in * this namespace is given by the additional bytes in sun_path that * are covered by the specified length of the address structure. * (Null bytes in the name have no special significance.) */ if (path[0] == '@') { path[0] = '\0'; socklen--; } #endif2. Alternative check.if (sa->u.sockaddr.sa_family == AF_UNIX) { && sa->u.sockaddr_un.sun_path[0] == '\0') {
Although, that code doesn't work. The listener doesn't support abstract sockets previous to my patch. What was that code used for? Or was it dead code?
It's for disabling chmod() on path which starts with \0.
$ git diff diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c index 0527d08f..4e1df140 100644 --- a/src/nxt_main_process.c +++ b/src/nxt_main_process.c @@ -1147,10 +1147,21 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls) #if (NXT_HAVE_UNIX_DOMAIN) if (sa->u.sockaddr.sa_family == AF_UNIX) { if (sa->u.sockaddr_un.sun_path[0] == '@') { + sa->socklen=3; sa->u.sockaddr_un.sun_path[0] = '\0'; } }
I can't get the code inside the if (sa->u.sockaddr_un.sun_path[0] == '@') { to trigger
Doing
sa->u.sockaddr_un.sun_path[0] = '\0';
sa->socklen--;
before that if check makes things work. Although I think we may have tickled a bug somewhere as now ss only shows @magic as magic, even though the bind(2) is doing
bind(10, {sa_family=AF_UNIX, sun_path=@"magic"}, 8) = 0
and it shows up correctly in /proc/net/unix
$ grep magic /proc/net/unix
00000000c72b4d32: 00000002 00000000 00010000 0001 01 7229023 @magic
but
$ ss -l | grep magic
u_str LISTEN 0 511 magic 7229392
But at least it works
$ curl --abstract-unix-socket magic http://localhost/
Hello, Python on Unit!
Unix domain sockets are normally backed by files in the filesystem. This has historically been problematic when closing and opening again such sockets, since SO_REUSEADDR is ignored for Unix sockets (POSIX left the behavior of SO_REUSEADDR
s/behavior/behaviour/
* unlink(2) the file right after creation. This will remove the pathname from the filesystem without closing the socket (it will continue alive until the last fd
s/alive/to live/
* unlink(2) the file right before calling bind(2). This is dangerous because another process (for example, another running instance of unitd(8)), could be using the socket, and removing the pathname from the filesystem would be problematic. To do this correctly, a lot of checks should be added before the actual unlink(2), which is bug-prone, and difficult to do
s/bug/error/
* Use abstract-namespace Unix domain sockets. This is the simplest solution, as it only requires accepting a slightly different syntax (basically a @ prefix) for the socket name, to transform it into a string starting with a null byte ('\0') that the kernel can understand. The patch is minimal. Since abstract sockets live in an abstract namespace, they don't create files in the filesystem, so there's no need to remove them later. The kernel removes the name when the last fd to it has been closed. One caveat is that only Linux currently supports this kind of Unix sockets. Of course, a solution to that could be to ask other kernels to implement such a feature. Another caveat is that filesystem permissions can't be used to control access to the socket file (since, of course, there's no file). Anyone knowing the socket name can access to it. The only method to control access to it is by using network_namespaces(7). Since in unitd(8) we're using 0666 file sockets, abstract sockets should be more insecure than that
s/be more/be no more/
You can also use SO_PEERCRED on Unix Domain Sockets.
(anyone can already read/write to the listener sockets). * Ask the kernel to implement an simpler way to unlink(2) socket
s/an/a/
s/behavior/behaviour/
Unfortunately, we default to US English :(
before that if check makes things work. Although I think we may have tickled a bug somewhere as now ss only shows @magic as magic, even though the bind(2) is doing [...]
But at least it works
$ curl --abstract-unix-socket magic http://localhost/ Hello, Python on Unit!
That's very weird. A bug in ss(8)? Could you please try my cli_a.c program to try to write to that magic socket? I'd like to make sure that the kernel has it right (as \0magic). Then maybe we can report a ss(8) bug.
Another test could be to try to run srv_a.c, which should fail to run if \0magic is already bound.
I don't trust curl(1) maybe doing weird stuff. It's too complex of a program.
It's a bug, in xterm!
I've seen this before, and it just occurred to me again now.
Lets see
xterm at 80 chars wide.
$ ss -l | grep @ | grep magic
u_str LISTEN 0 511 magic 7869871 * 0
The 'm' is at the 80 char mark.
So we got magic even though there is apparently no '@'!
Now lets extend xterm out a little (even just by 1 char).
$ ss -l | grep @ | grep magic
u_str LISTEN 0 511 @magic 7869871 * 0
The '@' is now at the 80 char mark.
There we go. Previously I'd already had xterm wider for the ss output so didn't notice this then.
I've noticed this problem in xterm before, particularly with gcc output.
#if (NXT_LINUX)
/* * Linux unix(7): * * abstract: an abstract socket address is distinguished by the fact * that sun_path[0] is a null byte ('\0'). The socket's address in * this namespace is given by the additional bytes in sun_path that * are covered by the specified length of the address structure. * (Null bytes in the name have no special significance.) */ if (path[0] == '@') { path[0] = '\0'; socklen--; }
This code doesn't seem to be executed, at least not before we call bind(2).
2. Alternative check.if (sa->u.sockaddr.sa_family == AF_UNIX) { && sa->u.sockaddr_un.sun_path[0] == '\0') {
Yeah, that's required. Here's a minimal patch on top of master to make it work.
diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index 03761a10..a8de8f89 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1143,6 +1143,12 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
#endif
+ if (sa->u.sockaddr.sa_family == AF_UNIX
+ && *sa->u.sockaddr_un.sun_path == '\0')
+ {
+ sa->socklen--;
+ }
+
if (bind(s, &sa->u.sockaddr, sa->socklen) != 0) {
err = nxt_errno;
@@ -1187,7 +1193,9 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
#if (NXT_HAVE_UNIX_DOMAIN)
- if (sa->u.sockaddr.sa_family == AF_UNIX) {
+ if (sa->u.sockaddr.sa_family == AF_UNIX
+ && *sa->u.sockaddr_un.sun_path != '\0')
+ {
char *filename;
mode_t access;
NOTE: I'm not saying this is correct, but it shows the code as is, is not far off (though the socket name does show up in the config as \u0000magic)
This code doesn't seem to be executed, at least not before we call bind(2).
Correct, the related APIs have not been used yet.
Digging, I found a lot of dead code that was accidentally left in some of Igor's refactors. I removed it all.
I found more or less why we are getting the extra length. See the following debug log:
alx@asus5775:/usr/local$ sudo cat unit.log | tr '\0' '#' | strings | grep ALX
ALX: nxt_sockaddr_text():293: 1
ALX: <#><
ALX: nxt_sockaddr_text():305: 1
ALX: <u><n><i><x><:><
ALX: nxt_sockaddr_unix_parse():631: 7
ALX: <@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():639: 6
ALX: <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 6
ALX: <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 6
ALX: <u><n><i><x><:><@>
ALX: nxt_sockaddr_unix_parse():631: 7
ALX: <#><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():639: 7
ALX: <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 7
ALX: <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 7
ALX: <u><n><i><x><:><@>
Got that with the following debug code:
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 8220590e..75c1efde 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -141,6 +141,10 @@ nxt_sockaddr_create(nxt_mp_t *mp, struct sockaddr *sockaddr, socklen_t length,
#endif
}
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) size-2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", sockaddr->sa_data[i]);
+fprintf(stderr, "\n");
#endif /* NXT_HAVE_UNIX_DOMAIN */
@@ -286,6 +290,10 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
#if (NXT_LINUX)
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) sa->socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", p[i]);
+fprintf(stderr, "\n");
if (p[0] == '\0') {
size_t length;
@@ -294,8 +302,16 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
p = nxt_sprintf(start, end, "unix:@%*s", length - 1, p + 1);
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) length);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
} else {
p = nxt_sprintf(start, end, "unix:%s", p);
+fprintf(stderr, "ALX: %s():%i:\n ALX: ", __func__, __LINE__);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
}
#else /* !(NXT_LINUX) */
@@ -612,10 +628,18 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
* are covered by the specified length of the address structure.
* (Null bytes in the name have no special significance.)
*/
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
if (path[0] == '@') {
path[0] = '\0';
socklen--;
}
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
#endif
The explanation is that we're going twice through nxt_sockaddr_unix_parse(). What is not clear to me is why are we going twice through that function. The solution seems also easy (although I'm not still convinced by it; it looks like something's wrong).
The easy fix:
if (path[0] == '@') {
path[0] = '\0';
socklen--;
}
should be transformed into
if (path[0] == '@') {
path[0] = '\0';
}
if (path[0] == '\0') {
socklen--;
}
But, as I hinted, the fix should probably be to make sure that we only run through that function once.
I configured with the following config file:
$ cat sock_static.json
{
"listeners": {
"unix:@magic": {
"pass": "routes",
"client_ip": {
"header": "X-Forwarded-For",
"source": "unix"
}
}
},
"routes": [{
"action": {
"share": "/home/alx/srv/www/$uri"
}
}]
}
s/behavior/behaviour/
Wontfix :)
s/alive/to live/ s/bug/error/ s/be more/be no more/ s/an/a/
Fixed.
Thanks!
I found the following:
ALX: nxt_conf_vldt_listener():1484
ALX: nxt_sockaddr_parse():544
ALX: nxt_sockaddr_parse_optport():566
ALX: nxt_sockaddr_unix_parse():635: 7
ALX: <@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():643: 6
ALX: <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 6
ALX: <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 6
ALX: <u><n><i><x><:><@>
ALX: nxt_sockaddr_parse_optport():585
ALX: nxt_sockaddr_parse():555
ALX: nxt_conf_vldt_listener():1497
ALX: nxt_router_conf_data_handler():739
ALX: nxt_router_conf_create():1483
ALX: nxt_router_socket_conf():2432
ALX: nxt_sockaddr_parse():544
ALX: nxt_sockaddr_parse_optport():566
ALX: nxt_sockaddr_unix_parse():635: 7
ALX: <^@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():643: 7
ALX: <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 7
ALX: <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 7
ALX: <u><n><i><x><:><@>
ALX: nxt_sockaddr_parse_optport():585
ALX: nxt_sockaddr_parse():555
ALX: nxt_router_socket_conf():2500
ALX: nxt_router_conf_create():1968
ALX: nxt_router_conf_data_handler():822
with the following debug code:
diff --git a/src/nxt_conf_validation.c b/src/nxt_conf_validation.c
index 3e89d775..9e5ca414 100644
--- a/src/nxt_conf_validation.c
+++ b/src/nxt_conf_validation.c
@@ -1481,6 +1481,7 @@ nxt_conf_vldt_listener(nxt_conf_validation_t *vldt, nxt_str_t *name,
{
nxt_int_t ret;
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
sa = nxt_sockaddr_parse(vldt->pool, name);
if (nxt_slow_path(sa == NULL)) {
@@ -1493,6 +1494,7 @@ nxt_conf_vldt_listener(nxt_conf_validation_t *vldt, nxt_str_t *name,
if (ret != NXT_OK) {
return ret;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return nxt_conf_vldt_object(vldt, value, nxt_conf_vldt_listener_members);
}
@@ -1752,6 +1754,7 @@ nxt_conf_vldt_proxy(nxt_conf_validation_t *vldt, nxt_conf_value_t *value,
nxt_str_t name;
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
nxt_conf_get_string(value, &name);
if (nxt_str_start(&name, "http://", 7)) {
@@ -1760,9 +1763,11 @@ nxt_conf_vldt_proxy(nxt_conf_validation_t *vldt, nxt_conf_value_t *value,
sa = nxt_sockaddr_parse(vldt->pool, &name);
if (sa != NULL) {
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return NXT_OK;
}
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return nxt_conf_vldt_error(vldt, "The \"proxy\" address is invalid \"%V\"",
&name);
@@ -3068,6 +3073,7 @@ nxt_conf_vldt_server(nxt_conf_validation_t *vldt, nxt_str_t *name,
{
nxt_int_t ret;
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
ret = nxt_conf_vldt_type(vldt, name, value, NXT_CONF_VLDT_OBJECT);
@@ -3081,6 +3087,7 @@ nxt_conf_vldt_server(nxt_conf_validation_t *vldt, nxt_str_t *name,
return nxt_conf_vldt_error(vldt, "The \"%V\" is not valid "
"server address.", name);
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return nxt_conf_vldt_object(vldt, value,
nxt_conf_vldt_upstream_server_members);
diff --git a/src/nxt_http_proxy.c b/src/nxt_http_proxy.c
index 6aa3aabb..3dc59b26 100644
--- a/src/nxt_http_proxy.c
+++ b/src/nxt_http_proxy.c
@@ -58,6 +58,7 @@ nxt_http_proxy_init(nxt_mp_t *mp, nxt_http_action_t *action,
nxt_upstream_t *up;
nxt_upstream_proxy_t *proxy;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
sa = NULL;
nxt_conf_get_string(acf->proxy, &name);
@@ -96,6 +97,7 @@ nxt_http_proxy_init(nxt_mp_t *mp, nxt_http_action_t *action,
action->handler = nxt_http_proxy;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return NXT_OK;
}
diff --git a/src/nxt_http_request.c b/src/nxt_http_request.c
index a16340de..10dd5d48 100644
--- a/src/nxt_http_request.c
+++ b/src/nxt_http_request.c
@@ -468,6 +468,7 @@ nxt_http_request_client_ip_sockaddr(nxt_http_request_t *r, u_char *start,
{
nxt_str_t addr;
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
addr.start = start;
addr.length = len;
@@ -498,6 +499,7 @@ nxt_http_request_client_ip_sockaddr(nxt_http_request_t *r, u_char *start,
return NULL;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return sa;
}
diff --git a/src/nxt_router.c b/src/nxt_router.c
index e1f4f6da..7f0b0832 100644
--- a/src/nxt_router.c
+++ b/src/nxt_router.c
@@ -736,6 +736,7 @@ nxt_router_conf_data_handler(nxt_task_t *task, nxt_port_recv_msg_t *msg)
nxt_int_t ret;
nxt_port_t *port;
nxt_router_temp_conf_t *tmcf;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
port = nxt_runtime_port_find(task->thread->runtime,
msg->port_msg.pid,
@@ -818,6 +819,7 @@ cleanup:
nxt_fd_close(msg->fd[0]);
msg->fd[0] = -1;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
}
@@ -1478,6 +1480,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
nxt_app_lang_module_t *lang;
nxt_router_app_conf_t apcf;
nxt_router_listener_conf_t lscf;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
static nxt_str_t http_path = nxt_string("/settings/http");
static nxt_str_t applications_path = nxt_string("/applications");
@@ -1962,6 +1965,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
nxt_queue_add(&deleting_sockets, &router->sockets);
nxt_queue_init(&router->sockets);
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return NXT_OK;
@@ -2425,6 +2429,7 @@ nxt_router_socket_conf(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
nxt_sockaddr_t *sa;
nxt_socket_conf_t *skcf;
nxt_listen_socket_t *ls;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
sa = nxt_sockaddr_parse(tmcf->mem_pool, name);
if (nxt_slow_path(sa == NULL)) {
@@ -2492,6 +2497,7 @@ nxt_router_socket_conf(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
nxt_memcpy(skcf->sockaddr, sa, size);
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return skcf;
}
diff --git a/src/nxt_runtime.c b/src/nxt_runtime.c
index 46955f1c..67a5fa6a 100644
--- a/src/nxt_runtime.c
+++ b/src/nxt_runtime.c
@@ -760,6 +760,7 @@ nxt_runtime_conf_init(nxt_task_t *task, nxt_runtime_t *rt)
nxt_sockaddr_t *sa;
nxt_file_name_str_t file_name;
const nxt_event_interface_t *interface;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
rt->daemon = 1;
rt->engine_connections = 256;
@@ -904,6 +905,7 @@ nxt_runtime_conf_init(nxt_task_t *task, nxt_runtime_t *rt)
if (nxt_runtime_controller_socket(task, rt) != NXT_OK) {
return NXT_ERROR;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return NXT_OK;
}
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 8220590e..6f476e8a 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -141,6 +141,10 @@ nxt_sockaddr_create(nxt_mp_t *mp, struct sockaddr *sockaddr, socklen_t length,
#endif
}
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) size-2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", sockaddr->sa_data[i]);
+fprintf(stderr, "\n");
#endif /* NXT_HAVE_UNIX_DOMAIN */
@@ -286,6 +290,10 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
#if (NXT_LINUX)
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) sa->socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", p[i]);
+fprintf(stderr, "\n");
if (p[0] == '\0') {
size_t length;
@@ -294,8 +302,16 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
p = nxt_sprintf(start, end, "unix:@%*s", length - 1, p + 1);
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) length);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
} else {
p = nxt_sprintf(start, end, "unix:%s", p);
+fprintf(stderr, "ALX: %s():%i:\n ALX: ", __func__, __LINE__);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
}
#else /* !(NXT_LINUX) */
@@ -525,6 +541,7 @@ nxt_sockaddr_parse(nxt_mp_t *mp, nxt_str_t *addr)
{
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
sa = nxt_sockaddr_parse_optport(mp, addr);
if (sa != NULL
@@ -535,6 +552,7 @@ nxt_sockaddr_parse(nxt_mp_t *mp, nxt_str_t *addr)
"The address \"%V\" must specify a port.", addr);
return NULL;
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return sa;
}
@@ -545,6 +563,7 @@ nxt_sockaddr_parse_optport(nxt_mp_t *mp, nxt_str_t *addr)
{
nxt_sockaddr_t *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
if (addr->length == 0) {
nxt_thread_log_error(NXT_LOG_ERR, "socket address cannot be empty");
return NULL;
@@ -563,6 +582,7 @@ nxt_sockaddr_parse_optport(nxt_mp_t *mp, nxt_str_t *addr)
if (nxt_fast_path(sa != NULL)) {
nxt_sockaddr_text(sa);
}
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return sa;
}
@@ -612,10 +632,18 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
* are covered by the specified length of the address structure.
* (Null bytes in the name have no special significance.)
*/
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
if (path[0] == '@') {
path[0] = '\0';
socklen--;
}
+fprintf(stderr, "ALX: %s():%i: %i\n ALX: ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
#endif
diff --git a/src/nxt_upstream_round_robin.c b/src/nxt_upstream_round_robin.c
index 31e2f48a..20771e88 100644
--- a/src/nxt_upstream_round_robin.c
+++ b/src/nxt_upstream_round_robin.c
@@ -51,6 +51,7 @@ nxt_upstream_round_robin_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
nxt_sockaddr_t *sa;
nxt_conf_value_t *servers_conf, *srvcf, *wtcf;
nxt_upstream_round_robin_t *urr;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
static nxt_str_t servers = nxt_string("servers");
static nxt_str_t weight = nxt_string("weight");
@@ -115,6 +116,7 @@ nxt_upstream_round_robin_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
upstream->proto = &nxt_upstream_round_robin_proto;
upstream->type.round_robin = urr;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
return NXT_OK;
}
So, we first parse the string when we validate it, and then again in the router when we use it. Knowing this, I don't think it can hurt to do the simple patch I suggested yesterday. I'm worried that some path of the existing code might not be prepared to handle a '\0' in the middle of the socket name. However, since that code already existed (i.e., we wouldn't be introducing new bugs or vulnerabilities), I think it's simpler to leave it untouched (assume it still works), and only change it if we find a bug, rather than trying to make sure that it's correct (which would be much harder). As long as my tests don't detect anything weird, I'm happy with it.
What do you think about it?
A general inspection of the code looks good to me. In the conf validation, we parse the socket address to check it's a valid one, and discard the result. Later in the router, we parse it again to use it. Nothing very weird. We could maybe keep the parsed data to not have to parse it twice... but that would require work that may be unnecessary, and is not vital, I think; there are no obvious places that would be unreasonably slow in that code.
Tested:
$ ss -l | grep magic
u_str LISTEN 0 511 @magic 32357 * 0
$ curl --abstract-unix-socket magic http://localhost/
idx
@echolimazulu
This code should work for you, I guess. Now I'll try to make NGINX work with abstract sockets too.
This is certainly the simplest and safest fix and if you wanted to have it stored in the config as @name than that could be done separately as it's likely to be more intrusive, so I'd say this was good to go.
Yep, I have no preference between \u0000 and @ in the state file. If we find that it causes trouble someday, we can fix it. For now, it's not broken, so let's keep the patch to the very minimum :)
@alejandro-colomar what does the "listeners" object look like when you query the config with an abstract socket?
$ sudo curl --unix-socket /usr/local/control.unit.sock localhost/config
{
"listeners": {
"unix:\u0000magic": {
"pass": "routes",
"client_ip": {
"header": "X-Forwarded-For",
"source": "unix"
}
}
},
"routes": [
{
"action": {
"share": "/home/alx/srv/www/$uri"
}
}
]
}
Seems correct UTF-8.
This JSON is valid, and should be fine to configure Unit with it. I just reconfigured Unit with that output, and it accepts it.
@lcrilly
Tested:
alx@asus5775:~$ sudo curl --unix-socket /usr/local/control.unit.sock localhost/config
{
"listeners": {
"unix:\u0000magic": {
"pass": "routes",
"client_ip": {
"header": "X-Forwarded-For",
"source": "unix"
}
}
},
"routes": [
{
"action": {
"share": "/home/alx/srv/www/$uri"
}
}
]
}
alx@asus5775:~$ sudo curl --unix-socket /usr/local/control.unit.sock localhost/config > tmp.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 252 100 252 0 0 463k 0 --:--:-- --:--:-- --:--:-- 246k
alx@asus5775:~$ sudo curl -X PUT -d @tmp.json --unix-socket /usr/local/control.unit.sock localhost/config
{
"success": "Reconfiguration done."
}
alx@asus5775:~$ ss -l | grep magic
u_str LISTEN 0 511 @magic 36684 * 0
alx@asus5775:~$ curl --abstract-unix-socket magic http://localhost/
idx
In case we want to store a @ in the state file (so that if a user queries unitd to read the current configuration, it will read the @), the diff is:
$ git diff
diff --git a/src/nxt_conf_validation.c b/src/nxt_conf_validation.c
index 3e89d775..62cada3e 100644
--- a/src/nxt_conf_validation.c
+++ b/src/nxt_conf_validation.c
@@ -1480,9 +1480,11 @@ nxt_conf_vldt_listener(nxt_conf_validation_t *vldt, nxt_str_t *name,
nxt_conf_value_t *value)
{
nxt_int_t ret;
+ nxt_str_t str;
nxt_sockaddr_t *sa;
- sa = nxt_sockaddr_parse(vldt->pool, name);
+ nxt_str_dup(vldt->pool, &str, name);
+ sa = nxt_sockaddr_parse(vldt->pool, &str);
if (nxt_slow_path(sa == NULL)) {
return nxt_conf_vldt_error(vldt,
"The listener address \"%V\" is invalid.",
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 63d3ffb3..8220590e 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -612,11 +612,8 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
* are covered by the specified length of the address structure.
* (Null bytes in the name have no special significance.)
*/
- switch (path[0]) {
- case '@':
+ if (path[0] == '@') {
path[0] = '\0';
- /* fall through */
- case '\0':
socklen--;
}
We could still keep the switch/case, in case a user wants to specify a \u0000, which also works, but we don't strictly need it, so I changed it in this diff to show that it's not strictly needed.
What do you say? Should we apply something of that diff?
Now, it shows:
$ sudo curl --unix-socket /usr/local/control.unit.sock localhost/config
{
"listeners": {
"unix:@magic": {
"pass": "routes",
"client_ip": {
"header": "X-Forwarded-For",
"source": "unix"
}
}
},
"routes": [
{
"action": {
"share": "/home/alx/srv/www/$uri"
}
}
]
}
Or, if we want to avoid dealing with \0s completely, we could use a different approach:
stop transforming @ into \0 when parsing the socket name, and do it at the time we call bind(2) (which was my first idea). The good thing about this is that it's safer: there are no weird null bytes in our strings. We would always have @ and do @->\0 right before bind(2), and undo it right after bind(2).
I have no strong preference, I guess showing '@' in the config is more 'user friendly'. If we do that, then which ever way we go (I think I'm leaning towards what you have above, rather than flip/flopping it) should be done as a separate commit.
What you currently have is legitimate on its own in that it simply makes abstract sockets work.
I agree that I won't be touching the existing commits. I like them very much. Anything else would be added as a separate commit, and in a separate review process.
For completeness, I'll also show the flip-flop approach:
diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index d3b49319..47419b6c 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1094,6 +1094,7 @@ nxt_main_port_socket_handler(nxt_task_t *task, nxt_port_recv_msg_t *msg)
static nxt_int_t
nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
{
+ int ret;
nxt_err_t err;
nxt_socket_t s;
@@ -1143,7 +1144,25 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
#endif
- if (bind(s, &sa->u.sockaddr, sa->socklen) != 0) {
+#if (NXT_HAVE_UNIX_DOMAIN)
+ if (sa->u.sockaddr.sa_family == AF_UNIX
+ && sa->u.sockaddr_un.sun_path[0] == '@')
+ {
+ sa->u.sockaddr_un.sun_path[0] = '\0';
+ }
+#endif
+
+ ret = bind(s, &sa->u.sockaddr, sa->socklen);
+
+#if (NXT_HAVE_UNIX_DOMAIN)
+ if (sa->u.sockaddr.sa_family == AF_UNIX
+ && sa->u.sockaddr_un.sun_path[0] == '\0')
+ {
+ sa->u.sockaddr_un.sun_path[0] = '@';
+ }
+#endif
+
+ if (ret != 0) {
err = nxt_errno;
#if (NXT_HAVE_UNIX_DOMAIN)
@@ -1188,7 +1207,7 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
#if (NXT_HAVE_UNIX_DOMAIN)
if (sa->u.sockaddr.sa_family == AF_UNIX &&
- sa->u.sockaddr_un.sun_path[0] != '\0')
+ sa->u.sockaddr_un.sun_path[0] != '@')
{
char *filename;
mode_t access;
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 63d3ffb3..5f45dee5 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -612,11 +612,7 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
* are covered by the specified length of the address structure.
* (Null bytes in the name have no special significance.)
*/
- switch (path[0]) {
- case '@':
- path[0] = '\0';
- /* fall through */
- case '\0':
+ if (path[0] == '@') {
socklen--;
}
The good thing about the flip-flop is that we simplify the following code a little bit:
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 5f45dee5..547628ac 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -292,7 +292,7 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
/* Linux abstract socket address has no trailing zero. */
length = sa->socklen - offsetof(struct sockaddr_un, sun_path);
- p = nxt_sprintf(start, end, "unix:@%*s", length - 1, p + 1);
+ p = nxt_sprintf(start, end, "unix:%*s", length, p);
} else {
p = nxt_sprintf(start, end, "unix:%s", p);
And more importantly, that we don't have a slight chance of having bugs in other parts of the code that may not have had that detail into account. And also, we avoid the string duplication.