kanal icon indicating copy to clipboard operation
kanal copied to clipboard

Access to `AsyncSignal` is not sound

Open sbarral opened this issue 2 years ago • 12 comments

I see that others have made related PRs and suggestions, but in my opinion these came short of fully solving the problem.

The issue is that the AsyncSignal::poll method takes &mut self, and at the same time AsyncSignal::send can be accessed with &self. In theory, this is in itself UB, though the compiler has escaped hatches for such cases (see below). I see that it has been suggested to use UnsafeCells to store members, but this is IMO not a solution: an UnsafeCell is only correct if you have several coexisting &self and one of them wants to access a member by mutable reference; it is still incorrect if one of them is an &mut self.

For instance, say that you store the waker in an UnsafeCell<Waker>. In the poll method, the compiler is now free to assume that you have mutable access to this UnsafeCell, but unfortunately this will also make it assume that it has automatically mutable access to the Waker. In fact, you will see that the method UnsafeCell::get_mut(&mut self) is a safe method for this very reason. So now the compiler is free to make optimizations based on the assumption that nobody reads or write the Waker at the same time, which of course is incorrect since send does access the waker concurrently.

One solution (or rather a temporary work-around until Rust has a real solution for that) is to wrap the state in an Aliasable and the waker in an Aliasable<UnsafeCell>. Unfortunately Aliasable does not exist yet in std but I would trust the pinned-aliasable crate due to the credentials of its author. Her repo is also a very good read to understand those kind of issues.

sbarral avatar Nov 01 '22 12:11 sbarral

Thanks for your report and for sharing this helpful information, After my last commit, do you still believe that access is not sound yet? I think AsyncSignal is sound because no concurrent mutable access is possible through the library, and mutable access is synchronized using State with atomic operations and the help of the Waker.

fereidani avatar Nov 01 '22 13:11 fereidani

So as discussed in the other threads, you will need to safely update the waker in AsyncSignal::poll, and then you will need to coordinate this access with the sender. This is a situation where it will be required to tell the compiler "I have a an &mut Waker access but I swear I will only use it as &Waker. After the commit, you don't update the Waker so this is not needed indeed, but not updating the waker is not correct either.

sbarral avatar Nov 01 '22 14:11 sbarral

Let's continue that subject in #14

fereidani avatar Nov 01 '22 15:11 fereidani

Fixed as discussed in #14

fereidani avatar Nov 01 '22 19:11 fereidani

Did you mean to close this one? This is orthogonal to issues #14 and #15 and is not fixed AFAICT.

sbarral avatar Nov 01 '22 19:11 sbarral

I thought that this one is fixed in your opinion too,send/recv is working only with MaybeUninit<T>, and read/write is synchronized with waker, which is a read-only variable after the signal is transferred to the opposite side of the interaction. there is no concurrent mutable access as far as I can tell.

If you think it's not sound, feel free to reopen the issue.

fereidani avatar Nov 01 '22 20:11 fereidani

Just to be clear because I don't know if this is what the misunderstanding is about: UnsafeCell does not allow concurrent mutable access, because that is always UB. What it does is it disables some compiler optimizations and tell the compiler that even though you have only shared ownership, you will sometimes upgrade it to mutable ownership but you know what you do and will never use concurrent mutable access.

Aliasable is bit different. It says: even though it looks like I have mutable access to this, in fact there may exist several shared (non-mutable) references to this object. Likewise, it does not allow you to have UB and have concurrent mutable access. It just disables optimizations that would not be sound in such context.

So when you have mutable access to something (like the Waker in poll) but sometimes you have several shared references to it and sometimes you will actually have (unique) mutable access, you actually need both. There is an example of this in the pin-list crate

https://github.com/SabrinaJewson/pin-list.rs/blob/3591d22c9f56b23d4a94c65288c953698830f73c/src/node.rs#L175-L182

sbarral avatar Nov 01 '22 21:11 sbarral

Serge, Waker is inited access only, after share it is immutable, access to the data in the MaybeUninit<T> is synchronized with the State exactly like what mutex does, so no reader exists at the same time of writing on any fields of the AsyncSignal. Read and writes are completely isolated, no reading is happening at the same time as write. Please correct me if you think there is a problem that I misunderstood.

fereidani avatar Nov 01 '22 21:11 fereidani

Sorry I haven't fully internalized the operation of kanal yet, but why can't this call to Waker::will_wake():

https://github.com/fereidani/kanal/blob/e415d692ef628ca8dfea10dc0bdc53c7b898b28a/src/future.rs#L210-L211

be made concurrently with this call to Waker::wake() in first.send() here:

https://github.com/fereidani/kanal/blob/e415d692ef628ca8dfea10dc0bdc53c7b898b28a/src/future.rs#L75-L77

OTOH, if that was the case this would be UB since the latter is a mutable access, so I am no longer sure about anything...

sbarral avatar Nov 01 '22 22:11 sbarral

The waker is cloned before calling to wake, and both waker operations are immutable access, read is not conflicting with read.

fereidani avatar Nov 01 '22 23:11 fereidani

Oh yes, so that's what I wrote earlier: sometimes the waker is accessed concurrently with shared references (clone() and will_wake()), and sometimes mutably (exclusively). This would be correct if you wrapped the waker in a Aliasable<UnsafeCell<Waker>>, but now it isn't.

Sorry but I really need to return to my other things.

sbarral avatar Nov 02 '22 08:11 sbarral

Hmm, I think there is a misunderstanding by one of us here, I'm reopening the issue until we are sure there is no UB. So I'm going to explain what I see is happening in the library:

  1. Waker is initialized mutably at the start before being shared.
  2. Waker is being shared using next_recv or next_send with the other thread.
  3. After sharing creator side have access to &mut reference but will only use it as it is & reference and another side only have access to const pointer.(edited) as you said:

So now the compiler is free to make optimizations based on the assumption that nobody reads or write the Waker at the same time, which of course is incorrect since send does access the waker concurrently.

In my understanding, it is not causing any problems. there is no concurrent read and write on Waker. there are only concurrent reads on the waker. Which compiler optimization could lead to a problem with concurrent reads?

fereidani avatar Nov 02 '22 09:11 fereidani

I close this as there is no response from the original author of the issue.

to wrap things up for those who like to know my explanation of why i think this issue is invalid:

  1. Waker is stored before being shared with other threads with the owned mutable reference.
  2. Other thread gets ownership of the signal from sync or async context and removes the signal from the waiting queue, from this point there will be no write on the waker from either side.
  3. Both the signal waiter, and the sender/receiver only read from Waker and are not updating it.

AsyncSignal is always pinned so pointer address to waker is pinned too. as there is no interior mutability required. there is no need for UnsafeCell. as there are no concurrent reads with writes there is no need for Aliasable

I'm ready to reopen the issue and continue further discussions if someone still thinks this is a bug.

fereidani avatar Nov 11 '22 12:11 fereidani