copyless
copyless copied to clipboard
Question: Why mem::replace?
I just ran across this and wanted to understand how this crate avoids memcpy. One thing I stumbled upon is this:
https://github.com/kvark/copyless/blob/a174875a9ac07dc7374f6b2002a083fa0764a649/src/boxed.rs#L10
Why is this using mem::replace, which seems like it'd do an unnecessary write, as opposed to let ptr = &mut *(self.0);?
Here it is with just
let ptr = self.0;
https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=e6d56b0b8587b398020085b9ec9c5130
You can see that because we never wrote null into self.0, the Drop implementation deallocates the pointer we just passed on to a Box. In this case, x in Foo::Small(x) gets set to 0 given the surrounding allocations, indicating corruption. Eventually there will be a second dealloc on the same pointer.
You could also write out literally what mem::replace effectively does, but I'm not sure how that would interact with the optimizer. You still avoid memcpy with this:
let ptr = self.0;
self.0 = ptr::null_mut();
playground::foo:
pushq %rax
movl $804, %edi
movl $4, %esi
callq *__rust_alloc@GOTPCREL(%rip)
movw $1024, (%rax)
popq %rcx
retq
That's why I think Option< NonNull<T>> should be used instead of *mut T. I'd rather have the type checker remind me of null / None handling (and in the example .take() the pointer) than doing C-like programming: see https://github.com/kvark/copyless/pull/1
Although it does generate very slightly worse code, because the NonNull constructor checks its invariant. If you're optimizing away a big memcpy that's not a huge problem.
@@ -3,6 +3,8 @@ playground::foo:
movl $804, %edi
movl $4, %esi
callq *__rust_alloc@GOTPCREL(%rip)
+ testq %rax, %rax
+ je .LBB9_1
movw $1024, (%rax)
popq %rcx
retq
Oh d'oh, I entirely missed the connection with drop. Sorry! If you don't object I could submit a PR to add a comment along those lines.
However, if it's only about drop, then why not let ptr = self.0; mem::forget(self)? Then you could even remove the conditional from drop.
@cormacrelf yep, there has to be an out-of-memory check somewhere within an allocation, that's why my allocation code aborts on the none case. NonNull::new should be a no-op (simple cast) assembly-wise, and the check should come from my checking if _.is_none() { abort() } the OOM, which has to be there
The code could indeed be further improved as @RalfJung suggested:
replace Option<NonNull<T>> by NonNull<T> altogether (thanks to the abort on null from OOM-handling), and then, on drop, use the pointer, unconditionally freeing it, thanks to a mem::forget on the init() function.
Hi everyone! Thank you for fruitful discussion here! Apparently, my watch setting was off by default, so I'm only now discovering what happened here :)