tinc
Add Linux sandbox (seccomp-bpf + Landlock LSM)
Landlock is similar to unveil() and is available on Linux 5.13+.
- https://lwn.net/Articles/859908
- https://docs.kernel.org/security/landlock.html
Since both are inherited by child processes, and there's no way to disable this, we use a trick similar to what browsers do: fork a process early, drop privileges in the main one, and use the still-privileged process for running scripts.
This also lets us remove access to fork()/execve() on OpenBSD for both sandbox levels.
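A rough sketch of the fork-early pattern described above, written in Python for brevity rather than in tinc's C. The names `spawn_script_worker` and `request_script`, the socketpair channel, and the NUL-separated request format are all illustrative assumptions, not what the PR implements.

```python
import os
import socket
import subprocess


def spawn_script_worker():
    """Fork a helper before sandboxing; return (worker pid, parent's socket)."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    pid = os.fork()
    if pid == 0:
        # Child: never sandboxed, only runs scripts when asked to.
        parent_sock.close()
        try:
            while True:
                request = child_sock.recv(4096)
                if not request:  # parent closed its end of the channel
                    break
                argv = request.decode().split("\0")
                try:
                    status = subprocess.run(argv).returncode
                except OSError:
                    status = 127  # could not start the script at all
                child_sock.send(str(status).encode())
        finally:
            os._exit(0)
    # Parent: keeps only its own end of the channel.
    child_sock.close()
    return pid, parent_sock


def request_script(sock, argv):
    """Ask the worker to run argv and return the exit status it reports."""
    sock.send("\0".join(argv).encode())
    return int(sock.recv(64).decode())
```

The main process would call `spawn_script_worker()` first and only then install its seccomp-bpf filter and Landlock rules (not shown here), so the worker is forked before the restrictions exist and does not inherit them.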
Hardware
Only amd64 and aarch64 are supported because I don't have access to anything else, and it's pretty dangerous to "support" other architectures without actually testing them. For example, some architectures use socketcall instead of separate accept/bind/connect/etc. syscalls, some implement gettimeofday through the vDSO instead of an actual syscall, and so on.
If a user wishes to go ahead anyway, they can force-enable the sandbox with:
$ meson setup build -D sandbox=enabled
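The gating logic can be sketched roughly as follows. This assumes `sandbox` is a Meson feature option (`auto`, `enabled`, `disabled`) that defaults to `auto`; the description only shows `-D sandbox=enabled`, so the other two values and the function name `sandbox_enabled` are assumptions made for illustration.

```python
import platform

# Machine names, as reported by uname, for the two tested architectures.
# amd64 shows up as "x86_64" on Linux.
TESTED_MACHINES = {"x86_64", "aarch64"}


def sandbox_enabled(option, machine=None):
    """Resolve a Meson-style feature value into an on/off decision."""
    if option not in ("auto", "enabled", "disabled"):
        raise ValueError(f"unknown feature value: {option!r}")
    if option == "enabled":
        return True  # forced on, even on an untested architecture
    if option == "disabled":
        return False
    # "auto": only turn the sandbox on where it has actually been tested.
    machine = machine or platform.machine()
    return machine.lower() in TESTED_MACHINES
```

With these assumptions, `sandbox_enabled("auto", "riscv64")` is `False`, while `sandbox_enabled("enabled", "riscv64")` is `True`.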
Security
There is at least one way to circumvent the sandbox, but I believe it will only work with a non-empty ScriptsInterpreter. Since tincd has full access to the hosts subdirectory, the attacker can create a hosts/xxx-up or hosts/xxx-down script and ask the script worker to execute it.
It shouldn't be possible with an empty interpreter since you need to make the script executable, and both umask() and all chmod-related syscalls are blocked by seccomp.
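To illustrate why the interpreter setting matters, here is a small Python sketch (not tinc code). It models a file that ends up without an executable bit by creating it with mode 0600, which is an assumption about how the sandboxed process is constrained. The helper names and the `xxx-up` script body are hypothetical.

```python
import os
import subprocess
import tempfile


def create_script(path, body):
    """Create a script with mode 0600, i.e. without any executable bit."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(body)


def exec_script(path, interpreter=None):
    """Run a script directly, or through an interpreter if one is set."""
    argv = [interpreter, path] if interpreter else [path]
    try:
        return subprocess.run(argv).returncode
    except PermissionError:
        return None  # execve() refused: the file is not executable


with tempfile.TemporaryDirectory() as hosts:
    script = os.path.join(hosts, "xxx-up")
    create_script(script, "#!/bin/sh\nexit 7\n")
    direct = exec_script(script)                       # empty interpreter
    via_sh = exec_script(script, interpreter="/bin/sh")  # interpreter set
```

Direct execution fails because the kernel checks the executable bit, while the interpreter only needs read access to the file, so the script runs anyway.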
Additionally, we remove write access to existing scripts inside hosts/ to prevent a compromised tincd from rewriting them and gaining shell access.
Reassigning ScriptsInterpreter to another value at runtime shouldn't be possible since the script worker uses its own copies of all configuration variables, and access to other processes' memory is prevented by seccomp.
tincd also doesn't have write access to its own configuration, so it cannot rewrite the config and restart itself.
Performance
The PR doesn't seem to affect a simple iperf3 run between two nodes in any way. The results may be different with hundreds of nodes, but I don't have the hardware to test this.
With seccomp-bpf and Landlock
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 99.4 MBytes 834 Mbits/sec 0 3.15 MBytes
[ 5] 1.00-2.00 sec 93.8 MBytes 786 Mbits/sec 0 3.15 MBytes
[ 5] 2.00-3.00 sec 95.0 MBytes 797 Mbits/sec 4112 1.19 MBytes
[ 5] 3.00-4.00 sec 98.8 MBytes 828 Mbits/sec 0 1.30 MBytes
[ 5] 4.00-5.00 sec 98.8 MBytes 828 Mbits/sec 10 1.38 MBytes
[ 5] 5.00-6.00 sec 115 MBytes 965 Mbits/sec 653 631 KBytes
[ 5] 6.00-7.00 sec 114 MBytes 954 Mbits/sec 1 567 KBytes
[ 5] 7.00-8.00 sec 114 MBytes 954 Mbits/sec 1 501 KBytes
[ 5] 8.00-9.00 sec 115 MBytes 965 Mbits/sec 0 652 KBytes
[ 5] 9.00-10.00 sec 114 MBytes 954 Mbits/sec 5 591 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.03 GBytes 887 Mbits/sec 4782 sender
[ 5] 0.00-10.01 sec 1.03 GBytes 884 Mbits/sec receiver
Without sandbox
[ 5] local 192.168.1.1 port 57292 connected to 192.168.1.2 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 107 MBytes 899 Mbits/sec 222 420 KBytes
[ 5] 1.00-2.00 sec 98.8 MBytes 828 Mbits/sec 0 574 KBytes
[ 5] 2.00-3.00 sec 104 MBytes 870 Mbits/sec 0 701 KBytes
[ 5] 3.00-4.00 sec 106 MBytes 891 Mbits/sec 7 634 KBytes
[ 5] 4.00-5.00 sec 100 MBytes 839 Mbits/sec 3 551 KBytes
[ 5] 5.00-6.00 sec 106 MBytes 891 Mbits/sec 0 686 KBytes
[ 5] 6.00-7.00 sec 108 MBytes 902 Mbits/sec 4 619 KBytes
[ 5] 7.00-8.00 sec 108 MBytes 902 Mbits/sec 1 547 KBytes
[ 5] 8.00-9.00 sec 109 MBytes 912 Mbits/sec 0 684 KBytes
[ 5] 9.00-10.00 sec 108 MBytes 902 Mbits/sec 4 621 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.03 GBytes 884 Mbits/sec 241 sender
[ 5] 0.00-10.01 sec 1.03 GBytes 881 Mbits/sec receiver
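For reference, the relative difference between the two summary lines can be worked out as below, using only the totals printed above. Each configuration was measured once, so this says nothing about run-to-run variance. The function name is just for illustration.

```python
def relative_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Average bitrates in Mbits/sec, copied from the two summaries above.
sender = relative_change(baseline=884, value=887)    # sandbox vs. no sandbox
receiver = relative_change(baseline=881, value=884)

print(f"sender: {sender:+.2f}%, receiver: {receiver:+.2f}%")
```

Both come out at roughly +0.34% with the sandbox enabled, which is small compared with the second-to-second variation within each run. The retransmit counts (4782 vs. 241) differ much more than the bitrates do.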
Thanks for the review, fixed. No need to rush this; the change is quite intrusive and risky.
I'll mark this as WIP for now. It's best to merge this after everything else is in.
After some recent changes to debian:testing, it fails to even resolve localhost without access to NSS dynamic libraries (blocking access to nsswitch.conf doesn't help), so I had to open a lot more paths (luckily it doesn't need write access anywhere).
I'll work on test coverage a bit more before marking this as finished.