linux icon indicating copy to clipboard operation
linux copied to clipboard

UDP socket control

Open l0kod opened this issue 1 year ago • 11 comments

We can now control TCP actions (bind(2) and connect(2)), and it would be useful to have a similar semantic for UDP. It's a bit tricky because of the datagram nature of UDP though.

However, it should not be an issue to check sendto(2) and recvfrom(2) for UDP because performant-sensitive applications should use connect/bind and send/recv instead, and we can check connect/bind without checking send/recv. Indeed, bind and connect can be used to configure an UDP socket, even if it is not a connected protocol. This approach enables to limit the number of kernel checks and copies.

We first need to check real use cases to validate these assumptions...

See https://lore.kernel.org/all/[email protected]/

Cc @BoardzMaster

l0kod avatar Jan 18 '24 20:01 l0kod

I found out that send is used in linux kernel implementation of AFS when server sends some UDP messages (e.g. ACK) in response to a client's request. I'm not sure if there could be any overhead due to landlock LSM, but this could be an example of latency-sensitive app, that doesn't use pre-connected send calls.

In general, server can be used to respond promptly to requests from different clients and it would be unreasonable to establish pseudo-connection for every client. What do you think about this case?

sm1ling-knight avatar Apr 01 '24 14:04 sm1ling-knight

I found out that send is used in linux kernel implementation of AFS when server sends some UDP messages (e.g. ACK) in response to a client's request. I'm not sure if there could be any overhead due to landlock LSM, but this could be an example of latency-sensitive app, that doesn't use pre-connected send calls.

Do you mean sendto? send should be OK. If sendto is used, the question is: could the related code be rewritten to use send instead (which may improve performance for the app)?

In general, server can be used to respond promptly to requests from different clients and it would be unreasonable to establish pseudo-connection for every client. What do you think about this case?

If a request is received by a server, wouldn't it be possible to just call send on the related socket?

l0kod avatar Apr 02 '24 17:04 l0kod

I found out that send is used in linux kernel implementation of AFS when server sends some UDP messages (e.g. ACK) in response to a client's request. I'm not sure if there could be any overhead due to landlock LSM, but this could be an example of latency-sensitive app, that doesn't use pre-connected send calls.

Do you mean sendto? send should be OK.

They use udp_sendmsg() under the hood which is equivalent to sendto in our case (since they both specify destination address for each call).

If sendto is used, the question is: could the related code be rewritten to use send instead (which may improve performance for the app)?

Yes, it can be rewritten using pre-connected mechanism, but i don't know how this will affect performance. In some cases, server could send only 1-2 ACK packets to the client, so it may be unreasonable to call connect for every client.

In general, server can be used to respond promptly to requests from different clients and it would be unreasonable to establish pseudo-connection for every client. What do you think about this case?

If a request is received by a server, wouldn't it be possible to just call send on the related socket?

Server can use single socket both for receiving and responding to messages. So, it must specify the destination address for each client via connect or sendto/sendmsg calls.

sm1ling-knight avatar Apr 04 '24 15:04 sm1ling-knight

Btw we may have some issues with UDP multicast sockets. From man connect(2):

If the socket sockfd is of type SOCK_DGRAM, then addr is the
address to which datagrams are sent by default, and the only
address from which datagrams are received.

This means that restricted sandbox can only use sendto calls for multicast sockets. What do you think?

sm1ling-knight avatar Apr 04 '24 16:04 sm1ling-knight

Btw we may have some issues with UDP multicast sockets. From man connect(2):

If the socket sockfd is of type SOCK_DGRAM, then addr is the
address to which datagrams are sent by default, and the only
address from which datagrams are received.

This means that restricted sandbox can only use sendto calls for multicast sockets. What do you think?

Yes, we'll need to support sendto and recvfrom. Anyway, I think the performance impact might be OK because UDP sandboxing will be opt-in (as other firewalls), and performance-sensitive services may not want to use this Landlock feature. After some benchmarks (#24), we'll need to think about performance improvements. #1 should help, but we may want to look at LRU, bloom filters, and other algorithms (taking into account #16).

l0kod avatar Apr 05 '24 13:04 l0kod

Hi Mickaël,

I've read the linked discussion threads, and I can try working on a first shot at the problem if that's still ok with your roadmap, e.g. considering #16 ? My current understanding is roughly:

  • add a couple new LANDLOCK_ACCESS_NET_RECV_UDP and LANDLOCK_ACCESS_NET_SEND_UDP handled accesses based on struct landlock_net_port_attr
  • update the socket_connect hook accordingly
  • add socket_sendmsg and socket_recvmsg hooks to filter sendto, sendmsg, sendmmsg, recvfrom, recvmsg, recvmmsg whilst letting calls with no address specified (e.g. from send, recv) through

About performance: I haven't had time to look at AFS, but more generally if anyone has potential applications (clients or servers) that could have issues with such a patch, I'd be interested to benchmark it. As a side note, for the usecase where an app needs to send/recv messages to too many hosts to be able to maintain a socket opened to each one, and has to send enough messages to each that LSM checks start having an impact, recvmmsg and sendmmsg both cache LSM verdicts and look like potential options.

I'll read up on general kernel dev in parallel, worst that can happen is I realize how much complexity is hidden within a few weeks and come back empty handed.

Cheers :)

mtth-bfft avatar Jun 21 '24 22:06 mtth-bfft

Hi Mickaël,

Hi Matthieu!

I've read the linked discussion threads, and I can try working on a first shot at the problem if that's still ok with your roadmap, e.g. considering #16 ?

Sure! #16 is not a blocker.

My current understanding is roughly:

  • add a couple new LANDLOCK_ACCESS_NET_RECV_UDP and LANDLOCK_ACCESS_NET_SEND_UDP handled accesses based on struct landlock_net_port_attr
  • update the socket_connect hook accordingly
  • add socket_sendmsg and socket_recvmsg hooks to filter sendto, sendmsg, sendmmsg, recvfrom, recvmsg, recvmmsg whilst letting calls with no address specified (e.g. from send, recv) through

Yes, that's the idea.

About performance: I haven't had time to look at AFS, but more generally if anyone has potential applications (clients or servers) that could have issues with such a patch, I'd be interested to benchmark it. As a side note, for the usecase where an app needs to send/recv messages to too many hosts to be able to maintain a socket opened to each one, and has to send enough messages to each that LSM checks start having an impact, recvmmsg and sendmmsg both cache LSM verdicts and look like potential options.

We could indeed rely on caching, but let's ignore this optimization for now. :wink:

About the performance impact, I think we should start with a first implementation and do some benchmarks to see the impact. I think it should be OK for most use cases, and controlling UDP (like other Landlock restrictions) will always be optional anyway. @sm1ling-knight is working on #24, which should help benchmark worse cases.

A simple optimization for pseudo-connected UDP sockets would be to tag the sockets when calling bind() and connect(), and check this tag on the sendmsg() and recvmsg() LSM hooks. That would not work for pure sendmsg() and recvmsg() use cases, but that should not be any worse than other access controls. :shrug:

This makes me think about more appropriate access rights: LANDLOCK_ACCESS_NET_BIND_UDP (explicit call to bind()) and LANDLOCK_ACCESS_NET_SENDTO_UDP (controlling both connect() and sendto() calls).

I'll read up on general kernel dev in parallel, worst that can happen is I realize how much complexity is hidden within a few weeks and come back empty handed.

I think it should be a good first contribution. We now have all the mechanic in place. The main new think that we may need would be a generic socket blob (i.e. there is no field dedicated to sk_security in lsm_blob_sizes for now) to be able to use Landlock with other LSMs (and enable each of them to tag the same socket), but you should start without implementing this part for now.

Cheers :)

Thinking more about this new access rights, here are more thoughts:

With TCP (i.e. connected protocol), we can deny arbitrary peer from sending data to a sandboxed process if we forbid LANDLOCK_ACCESS_NET_BIND and (upcoming) LANDLOCK_ACCESS_NET_LISTEN_TCP. (In practice this can be bypassed if a (remote) peer can spoof the connection, but in this case we have a more serious issue and sandboxing cannot help.)

With UDP (and other unconnected protocol), the local port can be set with a bind() call and the destination with either sendto/sendmsg() or connect() calls. However, without explicit binding, I guess the socket will automatically be binded to a port defined by /proc/sys/net/ipv4/ip_local_port_range when not explicitly calling bind(). For instance, I think the following scenario works (but it should be tested):

  • s = socket(UDP)
  • sendto(s, port-2000) // auto-binding to port 33000 (according to /proc/sys/net/ipv4/ip_local_port_range)
  • recvfrom(s) // get data from anyone sending to port 33000

If this is correct, we should think about a complementary access right to control automatic binding with sendto().

What do you think?

Cc @sm1ling-knight @gnoack

l0kod avatar Jun 24 '24 19:06 l0kod

Hi Mickaël,

Just a follow-up as I now have a few bits of answers and I see time flew since my last message.

With UDP (and other unconnected protocol), the local port can be set with a bind() call and the destination with either sendto/sendmsg() or connect() calls. However, without explicit binding, I guess the socket will automatically be binded to a port defined by /proc/sys/net/ipv4/ip_local_port_range when not explicitly calling bind(). [...] If this is correct, we should think about a complementary access right to control automatic binding with sendto().

An implicit call equivalent to bind(0) is made when sending a first datagram, or when using connect(). I tested it, and it does indeed allow the process to receive datagrams, without any explicit bind. For example, the exact call chain can be inet_sendmsg() -> inet_send_prepare() -> inet_autobind() -> udp_v4_get_port() which assigns a port. There is no LSM hook in any of this code, so unless we add one, Landlock won't prevent any process with >= 1 UDP access right from binding to any port in /proc/sys/net/ipv4/ip_local_port_range. Since already bound ports cannot be assigned as ephemeral ports (also tested), I guess the impact is being able to snoop on traffic not directed to the host (e.g. replies for another host on a hub network, or replies in multicast/broadcast). Not great, not terrible®.

This makes me think about more appropriate access rights: LANDLOCK_ACCESS_NET_BIND_UDP (explicit call to bind()) and LANDLOCK_ACCESS_NET_SENDTO_UDP (controlling both connect() and sendto() calls).

While working on a first patch for this model, I found two usecases in which the results are sub-optimal:

  • I need to bind() to set the source port of the packets I will send (e.g. because my protocol requires it, like multicast DNS). This means I need LANDLOCK_ACCESS_NET_BIND_UDP for that port. But now I've given the program the right to receive traffic on that port, too, even if it only transmits;
  • I need to connect() to set the IP address I want to receive traffic from (e.g. a client's IP address because I create client-specific sockets over a main unconnected one). This means I need LANDLOCK_ACCESS_NET_SENDTO_UDP for that port, but now I've given the right to send traffic, even if it only receives.

Since bind() can be needed to send traffic, and connect() can be needed to receive traffic, I guess the only way to treat these two cases correctly is to have security blobs in sockets and tag them like you said. That's for another day, first I'll try to finish a V0 patch.

mtth-bfft avatar Aug 04 '24 23:08 mtth-bfft

Thanks for the heads up.

  • I need to bind() to set the source port of the packets I will send (e.g. because my protocol requires it, like multicast DNS). This means I need LANDLOCK_ACCESS_NET_BIND_UDP for that port. But now I've given the program the right to receive traffic on that port, too, even if it only transmits;

Good point. LANDLOCK_ACCESS_NET_RECVFROM_UDP might be more appropriate in this case. LANDLOCK_ACCESS_NET_BIND_UDP could still be useful for explicit binding (i.e. not ephemeral ports) to control available host-side ports and avoid service impersonation.

  • I need to connect() to set the IP address I want to receive traffic from (e.g. a client's IP address because I create client-specific sockets over a main unconnected one). This means I need LANDLOCK_ACCESS_NET_SENDTO_UDP for that port, but now I've given the right to send traffic, even if it only receives.

OK, so similarly we could also have a LANDLOCK_ACCESS_NET_CONNECT_UDP to only configure the socket with an explicit call to connect(2).

These two use cases are good examples and should be documented in a dedicated patch.

Since bind() can be needed to send traffic, and connect() can be needed to receive traffic, I guess the only way to treat these two cases correctly is to have security blobs in sockets and tag them like you said. That's for another day, first I'll try to finish a V0 patch.

I think with these four new UDP access rights we should be fine. The documentation should highlight that connect and bind for UDP is only about socket configuration, not sending nor receiving. What do you think?

Socket tagging would be an optimization for when send(2) or recv(2) is used on a configured socket instead of sendto(2) or recvfrom(2) (or similar unconnected syscalls). But yes, no need to add that for the first patch series.

About tagging:

The main new think that we may need would be a generic socket blob (i.e. there is no field dedicated to sk_security in lsm_blob_sizes for now) to be able to use Landlock with other LSMs (and enable each of them to tag the same socket), but you should start without implementing this part for now.

The infrastructure management of the sock security landed in LSM next, so it should be part of Linux 5.12 and we can use this blob in Landlock. I guess it will not be needed though, nor landlock_file(sock->file)->allowed_access.

l0kod avatar Aug 07 '24 14:08 l0kod