Odd behaviour with socketcall multiplexer handling
Hello kind people,
I am the main author of syd, which thankfully uses libseccomp to provide a portable sandbox. In my testing I have noticed a few oddities on architectures that have both the socketcall(2) system call and the newer non-multiplexed versions of the socket system calls. One example is ppc64:
$ syd-sys -a ppc64 socketcall
socketcall 102
$ syd-sys -a ppc64 send
send 334
sendto 335
sendmsg 341
sendmmsg 349
...
Now assume we want to install a portable filter that denies the MSG_OOB flag for the send(2) and recv(2) families. See the section Denying MSG_OOB Flag in send/recv System Calls for why this is relevant for a security boundary. For socketcall(2) we have no option but to divert the handling to userspace with the notify action, and that's completely fine. However, suppose you install a filter like this (excuse my Rust, but the idea should be fairly obvious):
if restrict_oob {
    let oob = libc::MSG_OOB as u64;
    for (idx, sysname) in [
        "recvmsg", "sendmsg", "send", "sendto", "sendmmsg", "recv", "recvfrom",
        "recvmmsg",
    ]
    .iter()
    .enumerate()
    {
        // MsgFlags is arg==2 for {recv,send}msg, and
        // arg==3 for send/recv, sendto/recvfrom, and sendmmsg/recvmmsg.
        let sys = if let Ok(sys) = ScmpSyscall::from_name(sysname) {
            sys
        } else {
            continue;
        };
        let idx = if idx <= 1 { 2 } else { 3 };
        let err = ScmpAction::Errno(libc::EOPNOTSUPP);
        let cmp = ScmpArgCompare::new(idx, ScmpCompareOp::MaskedEqual(oob), oob);
        ctx.add_rule_conditional(err, sys, &[cmp])?;
    }
}
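For readers less familiar with libseccomp's masked comparison, the rule above is meant to match whenever `(arg & mask) == datum`, with both mask and datum set to MSG_OOB. The following is a minimal standalone sketch of that predicate, not libseccomp code: `masked_equal` and `denies_oob` are hypothetical helpers, and the flag constants are redefined locally with their Linux values so that no crate is needed.

```rust
/// Linux values of the flags, redefined here so the sketch is self-contained.
const MSG_OOB: u64 = 0x1;
const MSG_DONTWAIT: u64 = 0x40;

/// Hypothetical helper mirroring the masked-equality comparison:
/// the argument is ANDed with the mask and compared to the datum.
fn masked_equal(arg: u64, mask: u64, datum: u64) -> bool {
    (arg & mask) == datum
}

/// True when a flags argument would be caught by the MSG_OOB rule,
/// regardless of which other flag bits are set alongside it.
fn denies_oob(flags: u64) -> bool {
    masked_equal(flags, MSG_OOB, MSG_OOB)
}
```

With these definitions, `denies_oob(MSG_OOB | MSG_DONTWAIT)` is true while `denies_oob(MSG_DONTWAIT)` is false, which is the behaviour the filter is expected to have on every entry point of the send/recv families.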
One would expect the non-multiplexed versions of the send(2) family to be included in the filter, but with the latest libseccomp they are not, and our MSG_OOB tests fail on such architectures (ppc64, x86, ...) because of this.
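To make the two entry points concrete: on these architectures the same operation can arrive either as a direct system call or as socketcall(2) with a sub-call number plus a pointer to an argument array. A seccomp BPF filter cannot dereference that pointer, which is why the flags can only be inspected in userspace on the multiplexed path. The sketch below models that userspace check. The sub-call numbers are the ones from the kernel's linux/net.h; `flags_index` and `socketcall_has_oob` are hypothetical names, not syd or libseccomp APIs, and the argument array is assumed to have already been copied out of the sandboxed process.

```rust
const MSG_OOB: u64 = 0x1;

// socketcall(2) sub-call numbers, as defined in linux/net.h.
const SYS_SEND: u64 = 9;
const SYS_RECV: u64 = 10;
const SYS_SENDTO: u64 = 11;
const SYS_RECVFROM: u64 = 12;
const SYS_SENDMSG: u64 = 16;
const SYS_RECVMSG: u64 = 17;
const SYS_RECVMMSG: u64 = 19;
const SYS_SENDMMSG: u64 = 20;

/// Position of the flags argument for a given sub-call, or None when
/// the sub-call takes no message flags (socket, bind, connect, ...).
fn flags_index(call: u64) -> Option<usize> {
    match call {
        SYS_SENDMSG | SYS_RECVMSG => Some(2),
        SYS_SEND | SYS_RECV | SYS_SENDTO | SYS_RECVFROM | SYS_SENDMMSG | SYS_RECVMMSG => Some(3),
        _ => None,
    }
}

/// Userspace check for the multiplexed path: `args` is the argument
/// array that the second socketcall(2) parameter points to.
fn socketcall_has_oob(call: u64, args: &[u64]) -> bool {
    flags_index(call)
        .and_then(|i| args.get(i))
        .map_or(false, |flags| (*flags & MSG_OOB) != 0)
}
```

The argument positions match the ones used in the filter above (2 for sendmsg/recvmsg, 3 for the rest), so a handler built along these lines would enforce the same policy on the socketcall(2) path as the BPF rule is supposed to enforce on the direct calls.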
I have also encountered a similar problem where it is not directly possible to add notify actions to the non-multiplexed versions of the socket system calls. That, however, was possible to work around:
/// Insert a system call handler.
#[expect(clippy::cognitive_complexity)]
#[expect(clippy::disallowed_methods)]
fn insert_handler(
    handlers: &mut HandlerMap,
    syscall_name: &'static str,
    handler: impl Fn(UNotifyEventRequest) -> ScmpNotifResp + Clone + Send + Sync + 'static,
) {
    for arch in SCMP_ARCH {
        if let Ok(sys) = ScmpSyscall::from_name_by_arch(syscall_name, *arch) {
            #[expect(clippy::disallowed_methods)]
            handlers
                .insert(
                    Sydcall(sys, scmp_arch_raw(*arch)),
                    Arc::new(Box::new(handler.clone())),
                )
                .unwrap();
        } else {
            info!("ctx": "confine", "op": "hook_syscall",
                "msg": format!("invalid or unsupported syscall {syscall_name}"));
        }
        // Support the new non-multiplexed ipc syscalls.
        if IPC_ARCH.contains(arch) {
            let sys_ipc = match syscall_name {
                "shmat" => Some(397),
                "msgctl" => Some(402),
                "semctl" => Some(394),
                "shmctl" => Some(396),
                "msgget" => Some(399),
                "semget" => Some(393),
                "shmget" => Some(395),
                _ => None,
            };
            if let Some(sys) = sys_ipc {
                #[expect(clippy::disallowed_methods)]
                handlers
                    .insert(
                        Sydcall(ScmpSyscall::from(sys), scmp_arch_raw(*arch)),
                        Arc::new(Box::new(handler.clone())),
                    )
                    .unwrap();
                continue;
            }
        }
        // Support the new non-multiplexed network syscalls on MIPS, PPC, S390 & X86.
        let sys = match *arch {
            ScmpArch::M68k => match syscall_name {
                "socket" => 356,
                "bind" => 358,
                // no accept on m68k.
                "accept4" => 361,
                "connect" => 359,
                "getpeername" => 365,
                "getsockname" => 364,
                "getsockopt" => 362,
                "recvfrom" => 368,
                "sendto" => 366,
                "sendmsg" => 367,
                "sendmmsg" => 372,
                _ => continue,
            },
            ScmpArch::Mips | ScmpArch::Mipsel => match syscall_name {
                "socket" => 183,
                "bind" => 169,
                "accept" => 168,
                "accept4" => 334,
                "connect" => 170,
                "getpeername" => 171,
                "getsockname" => 172,
                "getsockopt" => 173,
                "recvfrom" => 176,
                "sendto" => 180,
                "sendmsg" => 179,
                "sendmmsg" => 343,
                _ => continue,
            },
            ScmpArch::Ppc | ScmpArch::Ppc64 | ScmpArch::Ppc64Le => match syscall_name {
                "socket" => 326,
                "bind" => 327,
                "accept" => 330,
                "accept4" => 344,
                "connect" => 328,
                "getpeername" => 332,
                "getsockname" => 331,
                "getsockopt" => 340,
                "recvfrom" => 337,
                "sendto" => 335,
                "sendmsg" => 341,
                "sendmmsg" => 349,
                _ => continue,
            },
            ScmpArch::S390X | ScmpArch::S390 => match syscall_name {
                "socket" => 359,
                "bind" => 361,
                // no accept on s390x.
                "accept4" => 364,
                "connect" => 362,
                "getpeername" => 368,
                "getsockname" => 367,
                "getsockopt" => 365,
                "recvfrom" => 371,
                "sendto" => 369,
                "sendmsg" => 370,
                "sendmmsg" => 358,
                _ => continue,
            },
            ScmpArch::X86 => match syscall_name {
                "socket" => 359,
                "bind" => 361,
                // no accept on x86.
                "accept4" => 364,
                "connect" => 362,
                "getpeername" => 368,
                "getsockname" => 367,
                "getsockopt" => 365,
                "recvfrom" => 371,
                "sendto" => 369,
                "sendmsg" => 370,
                "sendmmsg" => 345,
                _ => continue,
            },
            _ => continue,
        };
        handlers
            .insert(
                Sydcall(ScmpSyscall::from(sys), scmp_arch_raw(*arch)),
                Arc::new(Box::new(handler.clone())),
            )
            .unwrap();
        #[expect(clippy::arithmetic_side_effects)]
        if matches!(*arch, ScmpArch::Mips | ScmpArch::Mipsel) {
            // This is a libseccomp oddity,
            // it could be a bug in the syscall multiplexer.
            // TODO: Investigate and submit a bug report.
            handlers
                .insert(
                    Sydcall(ScmpSyscall::from(sys + 4000), scmp_arch_raw(*arch)),
                    Arc::new(Box::new(handler.clone())),
                )
                .unwrap();
        }
    }
}
Admittedly, it's a bit annoying to hardcode all these, but it works.
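One possible explanation for the extra MIPS entry, offered as an assumption rather than a diagnosis: in the kernel's MIPS o32 ABI, system call numbers start at a base of 4000, so the table value 183 and the value 4183 may simply be the same call seen without and with that base. The sketch below only illustrates the arithmetic; `MIPS_O32_BASE` and `mips_o32_candidates` are hypothetical names and say nothing about how libseccomp stores its tables internally.

```rust
/// Base offset of system call numbers in the MIPS o32 ABI.
const MIPS_O32_BASE: i32 = 4000;

/// Both numbers under which a handler is registered in the workaround:
/// the base-relative table value and the value with the o32 base added.
fn mips_o32_candidates(table_nr: i32) -> [i32; 2] {
    [table_nr, table_nr + MIPS_O32_BASE]
}
```

For "socket", whose table value above is 183, this yields 183 and 4183, matching the two `insert` calls in the MIPS branch.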
I do not know whether this oddity is a bug, but I would expect the socketcall(2) and ipc(2) multiplexer handling in libseccomp to take care of these for me behind the scenes. Is this possible? Thank you in advance.