thread_local-rs
thread_local-rs copied to clipboard
A little soundness hole around drop order of `thread_local!(static …)`
Here's the code to reproduce:
use std::{cell::{Cell, RefCell}, thread};
use thread_local::ThreadLocal;
fn main() {
static FOO: ThreadLocal<Cell<i32>> = ThreadLocal::new();
// thread 1
thread::spawn(|| {
thread_local!(
static ACCESSOR: RefCell<AccessFooFromNewThreadOnDrop> =
RefCell::new(AccessFooFromNewThreadOnDrop(None));
);
struct AccessFooFromNewThreadOnDrop(Option<&'static Cell<i32>>);
// FOO.get_or_default(); // uncomment this if needed, for opposite initialization order
// of the thread-locals `ACCESSOR` and “whatever ThreadLocal
// uses internally” [I suppose `THREAD_GUARD: ThreadGuard`]
// (drop order seems different on miri vs Linux)
ACCESSOR.with(|a| { // initialize ACCESSOR in thread 1, will be dropped at end of scope below
a.borrow_mut().0 = Some(FOO.get_or_default()); // store a borrow of thread-local FOO value
});
// destructor from `thread_local` library [I suppose `THREAD_GUARD: ThreadGuard`] runs first,
// making FOO's value in thread 1, referenced in ACCESSOR, available for re-use in new threads
impl Drop for AccessFooFromNewThreadOnDrop {
fn drop(&mut self) {
// thread 2
let t = thread::spawn(|| {
// succeeds, re-using Cell<i32> value from thread 1
let r1: &Cell<i32> = FOO.get().unwrap();
r1.set(42); // write-write data race, UB
});
let r2: &Cell<i32> = self.0.unwrap();
r2.set(1337); // write-write data race, UB
t.join().unwrap();
}
}
})
.join()
.unwrap();
}
(run on rustexplorer.com [where you'll just see it compiles&runs and doesn't panic; I haven’t made the data race particularly “observable” at run-time, to keep the code more clean])
With the additional FOO.get_or_default();
line uncommented as indicated, for panic-free execution on miri, this gets reported UB as expected:
error: Undefined Behavior: Data race detected between (1) non-atomic write on thread `unnamed-1` and (2) retag write of type `i32` on thread `unnamed-2` at alloc3418. (2) just happened here
--> /home/frank/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:482:31
|
482 | mem::replace(unsafe { &mut *self.value.get() }, val)
| ^^^^^^^^^^^^^^^^^^^^^^ Data race detected between (1) non-atomic write on thread `unnamed-1` and (2) retag write of type `i32` on thread `unnamed-2` at alloc3418. (2) just happened here
|
help: and (1) occurred earlier here
--> src/main.rs:37:17
|
37 | r2.set(1337); // write-write data race, UB
| ^^^^^^^^^^^^
= help: retags occur on all (re)borrows and as well as when references are copied or moved
= help: retags permit optimizations that insert speculative reads or writes
= help: therefore from the perspective of data races, a retag has the same implications as a read or write
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE (of the first span) on thread `unnamed-2`:
= note: inside `std::cell::Cell::<i32>::replace` at /home/frank/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:482:31: 482:53
= note: inside `std::cell::Cell::<i32>::set` at /home/frank/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:411:9: 411:26
note: inside closure
--> src/main.rs:34:21
|
34 | r1.set(42); // write-write data race, UB
| ^^^^^^^^^^
The issue here is the reasoning behind the T: Send ==> ThreadLocal<T>: Sync
logic not being 100% watertight. As stated in the docs:
Note that since thread IDs are recycled when a thread exits, it is possible for one thread to retrieve the object of another thread. Since this can only occur after a thread has exited this does not lead to any race conditions.
but the “can only occur after a thread has exited” part is enforced via a thread-local (the THREAD_GUARD: ThreadGuard
one if I'm not mistaken). After that gets dropped, destructors of other thread_local!(static …);
items can still run. Because ThreadLocal
also doesn’t come with a .with(|x| …)
-style API, borrows of 'static
lifetime can be created and thus a borrow of a ThreadLocal
’s contents can be made available to such destructors after the value has already been marked as re-usable by other threads via the ThreadGuard
dropped beforehand.
I can’t really think of any particularly satisfying alternative workarounds in implementation, i.e. without significant API-change: Technically, a with(|x| …)
-style API should probably “fix” this. The get()
-style API can also be kept in addition, as long as the contained type is Sync
(because unlike std::thread_local!
, this ThreadLocal
does not suffer from issues of the contents being dropped too early, when accessed during drops of thread locals).
For putting this into perspective; going as far as spawning a thread in the destructor of a thread-local is not necessary for unsoundness. The following simpler code too reliably reports UB on current miri (the thread::sleep
is necessary to make miri execute this in a way that it finds the UB, but it should be a possible data race even without it):
use std::{cell::Cell, thread, time::Duration};
use thread_local::ThreadLocal;
static FOO: ThreadLocal<Cell<i32>> = ThreadLocal::new();
struct WriteOnDrop(&'static Cell<i32>);
thread_local!(static W: Cell<WriteOnDrop> = unreachable!());
impl Drop for WriteOnDrop {
fn drop(&mut self) {
self.0.set(0);
}
}
fn main() {
let _ = thread::spawn(|| W.set(WriteOnDrop(FOO.get_or_default())));
thread::sleep(Duration::from_millis(100));
FOO.get().unwrap().set(0);
}
Hmm, I would have expected the lock on THREAD_ID_MANAGER
to act as a synchronization point which should avoid races. Specifically:
- When
ThreadGuard
is dropped, it locksTHREAD_ID_MANAGER
and releases the thread id. - When
thread_id::get
is called on the main thread for the first time, it will lockTHREAD_ID_MANAGER
and get the same thread id that the child thread used.
Ah I see the problem now. The issue is that the lifetime returned by get()
can outlive the thread ID.
You're right, I don't really see any alternative other than a .with
-style API. This is unfortunate since it would require a new major release considering the API change.
The
get()
-style API can also be kept in addition, as long as the contained type isSync
I would like to put emphasis on this, as get() is necessary for an API I have designed using ThreadLocal, and I use T: Sync+Send.
To alleviate the need for a breaking change, I would suggest introducing with(), deprecating get(), and offering alternatives for get() for when T: Send+Sync. I'm not sure on a good naming scheme for the alternatives, however
I think get_sync
with a T: Sync
bound is sufficiently explicit. And we can deprecate the existing get
API. That neatly avoids the need for a major release.