nix
nix copied to clipboard
`sched::clone` doesn't pass ownership of closure to new thread.
The current implementation of of sched::clone
doesn't pass ownership of the the closure called by callback
into the newly created thread, only a reference. This means that when the call to clone
ends, the callback is actually dropped, releasing all associated memory. This most likely means that the closure that actually gets run in the child thread is a reference to invalid memory when it is run.
Recommended fix is to unbox the CloneCb
into a raw pointer and box it back up again in callback
. This would mean that the closure would be dropped at the end of the callback in the child thread, rather than at the end of the call to clone
.
I'll have a patch in a few minutes.
Without https://github.com/rust-lang/rust/issues/28796 being stabilised, sched::clone
shouldn't support actual closures at all. It makes no sense for clone
to accept a closure with references as the child thread then assumes that everything being referenced lives longer than itself and the parent thread assumes that all borrows that the child thread had were dropped when the call to clone
succeeded.
In light of this, I think I'll make a patch to change clone to accept something more like what the syscall does now (i.e. fn(T) -> isize
with a T
passed into clone.
This is still a problem, and makes clone completely broken (rather than the expected horribly unsafe because it's clone). FnBox has been fixed for a while, though it would be a breaking change to improve CloneCb
to be Box<FnOnce()>
instead of Box<FnMut()>
. That said, fixing the other correctness issues there to make it Box<FnMut() -> isize + Send + 'static>
instead would also be breaking, so might as well make it FnOnce() at the same time.
On top of that &mut [u8]
is not a great choice for the stack type, as you need the borrow for up to 'static, not just for the duration of clone(), and the most obvious ways to allocate [u8; stack size] will break in practice.
@talchas would you like to try rewriting clone? I wouldn't worry about preserving backwards-compatibility; it probably can't be done in a way that also fixes clone's problems.
I'd go with
pub fn clone<F: FnOnce() -> c_int + Send + 'static>(
cb: F,
stack: *mut [u8],
flags: CloneFlags,
signal: Option<c_int>,
) -> Result<Pid> {
extern "C" fn callback<F: FnOnce() -> c_int + Send + 'static>(data: *mut c_void) -> c_int {
let cb: Box<F> = unsafe { Box::from_raw(data as *mut F) };
(*cb)()
}
let cb = Box::into_raw(Box::new(cb));
let res = unsafe {
let combined = flags.bits() | signal.unwrap_or(0);
let ptr = (stack as *mut u8).add((*stack).len());
let ptr_aligned = ptr.sub(ptr as usize % 16);
libc::clone(
callback::<F>,
ptr_aligned as *mut c_void,
combined,
cb as *mut c_void,
)
};
let res = Errno::result(res).map(Pid::from_raw);
// In CLONE_VM we need to free the callback on both sides
if flags.contains(CloneFlags::CLONE_VM) || res.is_err() {
unsafe { Box::from_raw(cb); }
}
res
}
but there's a lot of fairly reasonable bikeshed options: taking *mut F
(and probably having users just leak it or do it wrong) or Box<FnOnce() -> c_int + Send + 'static>
, maybe taking Box<[u8]>
for stack. Then there's the option of not doing closures at all since they're still kinda sketchy in CLONE_VM (anything calling into the allocator in the new mostly-process is), though then the benefit vs libc is small.
- Taking raw pointers isn't much better than having the user call FFI functions directly. Could
stack
be a&'static mut [u8]
instead? Granted, it would probably have to be unsafely created. Or if you takeBox<[u8]>
for the stack, who would be responsible for deallocating it, the parent, the child, or both? - What's the point of taking a closure argument if it must be
'static
? Is that any different than taking a plain function pointer? If there is a good reason to use a closure argument, then yourBox::into_raw
logic looks correct. - I think your logic for CLONE_VM is backwards. The parent should free the callback, unless that flag is set.
- Finally, this function should definitely be
unsafe
.
Yeah, I didn't do &'static because it would need unsafe anyways, and didn't do Box<[]> because of the freeing issues, and statically allocating a buffer is maybe useful sometimes.
You still want a closure, even if it's 'static; you can move in a String/ints/anything else owned still.
Yes, I forgot which way around CLONE_VM worked, and making it unsafe is probably good too.
Hm. The child process/thread thing will never free stack, and unless CLONE_VM
is set then the parent can do whatever it wants with the stack after return. So maybe it doesn't need to be 'static
. Would it be better to make stack a &mut [u8]
, with a note in the documentation that it must outlive the child process if CLONE_VM
is set? Or should we remove CLONE_VM
from CloneFlags
, and create a separate clone_vm
function that requires a static array for stack
?
Having a separate CLONE_VM certainly could be reasonable, as the mechanics around memory safety and lifetimes are completely different. For non-VM ones you can drop the 'static requirement, use &mut [u8], and use &mut cb rather than boxing it (but that one definitely should be unsafe
for similar reasons to CommandExt::pre_exec).
Looking more at clone(2), I guess there's also the issue that all the SETTID/CLEARTID/PIDFD stuff causes it to write to the varargs slots which there is no way to fill in, so setting those flags has no use other than being UB. Similarly, SETTLS can't be specified, and I guess would be reason enough to make clone_vm() be unsafe
as well, since tls is assumed to be set up properly for safe code.
Yeah, good catch. You should remove those flags too, or create separate functions to enable their use.