__X32_SYSCALL_BIT not checked

Open mejedi opened this issue 5 years ago • 2 comments

man seccomp:

The arch field is not unique for all calling conventions. The x86-64 ABI and the x32 ABI both use AUDIT_ARCH_X86_64 as arch, and they run on the same processors. Instead, the mask __X32_SYSCALL_BIT is used on the system call number to tell the two ABIs apart.

This means that in order to create a seccomp-based blacklist for system calls performed through the x86-64 ABI, it is necessary to not only check that arch equals AUDIT_ARCH_X86_64, but also to explicitly reject all system calls that contain __X32_SYSCALL_BIT in nr.

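Apparently, __X32_SYSCALL_BIT is not checked. This means that if a policy is compiled for x86_64 and blacklists certain syscalls while the default action is ALLOW, an x32 caller bypasses the blacklist: its syscall numbers carry __X32_SYSCALL_BIT, so they match none of the blacklisted numbers and fall through to ALLOW.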

$ echo "DENY{SYSCALL[10]}DEFAULT ALLOW" | ./tools/dump_policy_bpf/dump_policy_bpf
BPF program with 7 instructions
  0: A := architecture
  1: if A != 0xc000003e goto 5
  2: A := syscall number
  3: if A < 0xa goto 6
  4: if A >= 0xb goto 6
  5: KILL
  6: ALLOW
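
To make the bypass concrete, here is a minimal Python sketch that mirrors the dumped instructions one-to-one. It is a simulation, not kafel code: the function names are hypothetical, and `filter_with_x32_check` only illustrates the extra test that the seccomp man page asks for. The constant values are those of AUDIT_ARCH_X86_64 (0xc000003e, as in the dump) and __X32_SYSCALL_BIT (0x40000000).

```python
AUDIT_ARCH_X86_64 = 0xC000003E  # arch value shared by the x86-64 and x32 ABIs
X32_SYSCALL_BIT = 0x40000000    # value of __X32_SYSCALL_BIT, set in nr by x32 callers


def filter_as_dumped(arch, nr):
    """Mirror the dumped program (hypothetical helper, not part of kafel)."""
    if arch != AUDIT_ARCH_X86_64:  # instructions 0-1: other arch -> 5 (KILL)
        return "KILL"
    if nr < 0xA:                   # instructions 2-3: below the denied number
        return "ALLOW"
    if nr >= 0xB:                  # instruction 4: above the denied number
        return "ALLOW"
    return "KILL"                  # instruction 5: nr == 10


def filter_with_x32_check(arch, nr):
    """Illustrative variant: reject any nr that carries the x32 bit first."""
    if arch != AUDIT_ARCH_X86_64:
        return "KILL"
    if nr & X32_SYSCALL_BIT:
        return "KILL"
    return filter_as_dumped(arch, nr)


if __name__ == "__main__":
    for nr in (10, 10 | X32_SYSCALL_BIT):
        print(hex(nr),
              filter_as_dumped(AUDIT_ARCH_X86_64, nr),
              filter_with_x32_check(AUDIT_ARCH_X86_64, nr))
```

In this simulation, syscall 10 is killed by both variants, while 0x4000000a (the same number with the x32 bit set) is allowed by the dumped program and killed only by the variant with the extra check.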

mejedi, Apr 06 '19 19:04

I encountered the same issue plus the 'mirrored' one: an amd64 kernel runs not only x32 binaries but also i386 ones, yet the BPF policy checks only one architecture. To close these issues, I updated the i386 and amd64 syscall sets from the current kernel (Debian GNU/Linux 5.8.10) and added an x32 syscall set.

The next step is to define the 'companion architectures' and let the policy code generator add them to the BPF policy. x32 should be handled under the amd64 architecture check, and i386 should get another if A... clause.
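
One way to picture the 'companion architectures' idea is the Python sketch below. It is an assumption about the shape of the generated policy, not the actual kafel generator: `DENY`, `evaluate`, and the per-ABI table layout are hypothetical names, and the syscall numbers are illustrative (mprotect is 10 on amd64 and x32, 125 on i386). The x32 ABI is distinguished inside the AUDIT_ARCH_X86_64 branch by the x32 bit, while i386 gets its own architecture branch.

```python
AUDIT_ARCH_I386 = 0x40000003    # arch value reported for i386 callers
AUDIT_ARCH_X86_64 = 0xC000003E  # shared by the x86-64 and x32 ABIs
X32_SYSCALL_BIT = 0x40000000    # value of __X32_SYSCALL_BIT

# Hypothetical per-ABI deny sets for one logical syscall (mprotect).
# x32 numbers are stored without the x32 bit.
DENY = {
    "amd64": {10},
    "x32": {10},
    "i386": {125},
}


def evaluate(arch, nr, deny=DENY, default="ALLOW"):
    """Pick the deny set that matches the caller's ABI, then apply it."""
    if arch == AUDIT_ARCH_X86_64:
        if nr & X32_SYSCALL_BIT:
            # x32 caller: strip the bit and consult the x32 table.
            denied = (nr & ~X32_SYSCALL_BIT) in deny["x32"]
        else:
            denied = nr in deny["amd64"]
    elif arch == AUDIT_ARCH_I386:
        # Companion architecture with its own syscall numbering.
        denied = nr in deny["i386"]
    else:
        return "KILL"  # architecture not covered by the policy
    return "KILL" if denied else default
```

With such a layout, a single DENY rule written against amd64 would expand into one check per ABI, so neither an x32 nor an i386 caller falls through to the default action.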

basilgello, Sep 21 '20 19:09

+1. We use nsjail on compiler-explorer, and this currently prevents us from enabling any seccomp rules due to issues with 32-bit binaries.

apmorton, Apr 02 '21 14:04