cs431 icon indicating copy to clipboard operation
cs431 copied to clipboard

[Question] proper use of mem::forget in unwrap of arc value

Open thekioskman opened this issue 11 months ago • 6 comments

I am having a hard time wrapping my head around the goal of the unwrap function:

I am assuming the point is to take ownership of the inner data T (which is somewhere on the heap). I tried googling some ways to get the value without moving and ended up with some code like this: '''
let return_data = this.ptr.as_ptr().read().data;
//forget the data so we dont double drop in the test case mem::forget(this.ptr.as_ptr().read().data);

'''

Is the point that we are supposed to forget about one instance of the data because we know its going to get freed when the result of unwrap is dropped? Is this the correct way to use mem::forget?

thekioskman avatar Mar 28 '24 11:03 thekioskman

  1. try_unwrap takes an ownership of this and returns its inner data with the ownership.
  2. After the try_unwrap call, only the inner data have to be 'alive' and returned, while both this and inner of this must be dropped.
  3. From 1 and 2, you can infer that you should take an ownership of inner struct that was leaked. You can use the ownership of inner struct to return the ownership of the data inside it.
  4. mem::forget consumes ownership but it doesn't call its destructor.
  5. In the end of try_unwrap, lifetime of this ends. Since drop of Arc<T> tries to take ownership of its inner value again, you have to make it not to invoke drop. You can use mem::forget here.

I hope this helps you.

ICubE- avatar Mar 28 '24 11:03 ICubE-

I played around with the code for a while, I still don't think I understand. Is the code I posted not doing what you described? I also tried an implementation where I take the Arc using the box::from_raw implementation used in the drop function, but that led to a [STATUS_HEAP_CORRUPTION error ].

thekioskman avatar Mar 28 '24 13:03 thekioskman

The code that you presented seems somewhat weird to me. The read() function on raw pointer is marked as unsafe since it gives you ownership for free without compiler check. So, at the moment of let return_data = ..., you will have two ownerships of .data field, namely one from self and the other from read(). So, even if you call mem::forget on the data obtained by read(), self is still alive and it causes destruction of entire Arc.

In fact, you can implement correct version with Box::from_raw and mem::forget. I guess you can get understanding on correct implementation from https://github.com/kaist-cp/cs431/issues/534#issuecomment-960478314.

kingdoctor123 avatar Mar 28 '24 14:03 kingdoctor123

I'm going to elaborate a bit more onto Sunho's comment.

Here is the code you described, slightly un-inlined.

// At this point, know that `this` is unique Arc.
let return_data = this.ptr.as_ptr().read().data;
let return_data_2 = this.ptr.as_ptr().read().data;
mem::forget(return_data_2);

If we label all the resources we hold at each point in code, we get something like the following

// At this point, know that `this` is unique Arc.
// Have: Arc(x) : Arc<T>
let return_data = this.ptr.as_ptr().read().data;
// Have: Arc(x) : Arc<T>, return_data: T
let return_data_2 = this.ptr.as_ptr().read().data;
// Have: Arc(x) : Arc<T>, return_data: T, return_data_2: T
mem::forget(return_data_2);
// Have: Arc(x) : Arc<T>, return_data: T
...
// when dropping Arc(x) before returning:
// Arc(x) is changed to a x : T, and this x is dropped. Now, using return_data can lead to use-after-free.

As you can see, after you do the second read(), you have two ownership of T, which clearly isn't what you intended. What you were probably intending to do was forget the arc so that it does not drop.

When using ptr::read() you need to make sure that you are allowed to take ownership by reading from the raw pointer. The first read is (almost) fine. The second one is problematic.

Reading the docs on ptr::read should also help.

Note that you need to worry about the same things when using Box::from_raw. In particular, the following code has the same problems.

let return_data = Box::from_raw(this.ptr.as_ptr());
mem::forget(Box::from_raw(this.ptr.as_ptr()));

Lee-Janggun avatar Mar 29 '24 02:03 Lee-Janggun

OH, I get it, thanks for everyone's comments. So essentially the point is I don't run the destructor of arc<T> or else I will have two box::from_raw calls, which of course is a double free. But there is still one thing I don't understand.

So we are telling the compiler to forget arc which contains a pointer and phantom data. The pointer we took away, and phantom data has no memory, but would we not leak the memory it takes to create an empty struct (arc)?

thekioskman avatar Mar 29 '24 06:03 thekioskman

The empty struct is in the stack, hence it will be cleanup up when the function returns. Calling mem::forget() means to not call the custom drop function, which deals with the logical ownership the struct has.

Lee-Janggun avatar Mar 29 '24 06:03 Lee-Janggun