Miri detects UB in test-suite
I tried running the arc-swap test bench under miri with the 'many seeds' feature active.
MIRIFLAGS=-Zmiri-many-seeds=0..2000 cargo miri test
It fails after a while, with the following error:
test tests_default::rcu ... error: Undefined Behavior: out-of-bounds pointer arithmetic: alloc108734005 has been freed, so this pointer is dangling
--> /home/anders/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:1690:27
|
1690 | let arc_ptr = ptr.byte_sub(offset) as *mut ArcInner<T>;
| ^^^^^^^^^^^^^^^^^^^^ out-of-bounds pointer arithmetic: alloc108734005 has been freed, so this pointer is dangling
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
help: alloc108734005 was allocated here:
--> src/lib.rs:1161:44
|
1161 | let shared = ArcSwap::from(Arc::new(0));
| ^^^^^^^^^^^
...
1236 | t!(tests_default, DefaultStrategy);
| ---------------------------------- in this macro invocation
help: alloc108734005 was deallocated here:
--> src/lib.rs:1166:60
|
1166 | shared.rcu(|old| **old + 1);
| ^
...
The above is using nightly from '2025-03-15'.
Hello.
I've noticed that too and it doesn't even need that parameter. However, so far I haven't been able to figure out the cause of that or if it is legitimate or some kind of false alarm (I won't place a bet on either at this point ‒ reading the code does not hint at how that could be possible and eg. valgrind didn't find anything, but 🤷 )
So far, I've only figured out it is somehow related to the fallbacks inside the locking strategy.
Few notes for myself / anyone also poking at it.
No concrete results, but my hunch is it is:
- Somewhere inside the src/debt/helping.rs (or whatever uses that one).
- It is triggered only when there are at least two concurrent writers (and probably some reader at the same time).
I'm still trying to lay traps, read through the code and figure out what interaction between the threads is responsible.
Hi
I've had some unreachable code that was reached (in helping.rs:299), I don't know if it's the same underlying problem (sorry I cannot share the code, but I will try to reproduce and minimize it, also maybe it's due to UB in my code)
Anyway, I've tried to see if I could help find the problem here, and I found out that:
1. On my machine, running
MIRIFLAGS="-Zmiri-seed=23 -Zmiri-permissive-provenance -Zmiri-disable-stacked-borrows -Zmiri-disable-validation" cargo +nightly miri test -- tests_default::rcu
fails consistently with the same backtrace as above (rustc 1.89.0-nightly (45f256d9d 2025-05-27), aarch64-apple-darwin).
2. Changing storage.load(Relaxed) in strategy/hybrid.rs:44 to storage.load(SeqCst) makes the error disappear.
But then, after applying the second change, I reran the test and got a different error:
test tests_default::rcu ... error: Undefined Behavior: Data race detected between (1) non-atomic read on thread `unnamed-3` and (2) deallocation on thread `unnamed-4` at alloc51714+0x10. (2) just happened here
--> /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/alloc/mod.rs:388:18
|
388 | unsafe { (**self).deallocate(ptr, layout) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Data race detected between (1) non-atomic read on thread `unnamed-3` and (2) deallocation on thread `unnamed-4` at alloc51714+0x10. (2) just happened here
|
help: and (1) occurred earlier here
--> src/lib.rs:1166:50
|
1166 | shared.rcu(|old| **old + 1);
| ^^^^^
...
1236 | t!(tests_default, DefaultStrategy);
| ---------------------------------- in this macro invocation
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE (of the first span) on thread `unnamed-4`:
= note: inside `<&std::alloc::Global as std::alloc::Allocator>::deallocate` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/alloc/mod.rs:388:18: 388:50
= note: inside `<std::sync::Weak<usize, &std::alloc::Global> as std::ops::Drop>::drop` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/sync.rs:3323:17: 3323:97
= note: inside `std::ptr::drop_in_place::<std::sync::Weak<usize, &std::alloc::Global>> - shim(Some(std::sync::Weak<usize, &std::alloc::Global>))` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:797:1: 797:56
= note: inside `std::sync::Arc::<usize>::drop_slow` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/sync.rs:1944:5: 1944:6
= note: inside `<std::sync::Arc<usize> as std::ops::Drop>::drop` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/sync.rs:2686:13: 2686:29
= note: inside `std::ptr::drop_in_place::<std::sync::Arc<usize>> - shim(Some(std::sync::Arc<usize>))` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:797:1: 797:56
= note: inside `std::mem::ManuallyDrop::<std::sync::Arc<usize>>::drop` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/mem/manually_drop.rs:256:18: 256:53
note: inside `<strategy::hybrid::HybridProtection<std::sync::Arc<usize>> as std::ops::Drop>::drop`
--> src/strategy/hybrid.rs:121:18
|
121 | unsafe { ManuallyDrop::drop(&mut self.ptr) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: inside `std::ptr::drop_in_place::<strategy::hybrid::HybridProtection<std::sync::Arc<usize>>> - shim(Some(strategy::hybrid::HybridProtection<std::sync::Arc<usize>>))` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:797:1: 797:56
= note: inside `std::ptr::drop_in_place::<Guard<std::sync::Arc<usize>>> - shim(Some(Guard<std::sync::Arc<usize>>))` at /Users/ben/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:797:1: 797:56
note: inside `ArcSwapAny::<std::sync::Arc<usize>>::rcu::<usize, {closure@src/lib.rs:1166:44: 1166:49}>`
--> src/lib.rs:617:17
|
617 | cur = prev;
| ^^^
note: inside closure
--> src/lib.rs:1166:33
|
1166 | shared.rcu(|old| **old + 1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
1236 | t!(tests_default, DefaultStrategy);
| ---------------------------------- in this macro invocation
= note: this error originates in the macro `t` (in Nightly builds, run with -Z macro-backtrace for more info)
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
So I tried switching other Relaxed atomic operations to SeqCst, and the one that solved the issue was in debt/mod.rs:76.
(I don't know if that helps you but maybe it will lead to a better solution)
I have a branch that doesn't exhibit the problem. But taking bits and pieces of it and applying them independently doesn't resolve the issue so far. If you're interested, you can have a look at the miri-bug branch (disclaimer: It's a WIP branch / a mess of throwing experiments here and there).
I'd be afraid that replacing some Relaxed by SeqCst may either be the solution or it may be just a small shake in the randomization of miri that just leads to it not hitting the problem :-|. Slightly changing something does lead to one or the other error for me too. I suspect these two errors are in some way related, but I didn't have the time to dive into it further :-(.
Unfortunately, it turned out that the change that made it go away in miri was changing the number of parallel threads in the tests. This likely (together with other changes) made the problem show up so rarely that it didn't appear, but it's obviously not the solution. So, back to the drawing board, to keep looking and analyzing proofs.
I remember that in my crates I had to enable tree borrows, and last time I checked they wanted to move to -Zmiri-tree-borrows -Zmiri-strict-provenance. If arc-swap can achieve this I would definitely use it. Edit: I posted too quickly. I see this is probably something else, unrelated to borrows, but can you always add tree borrows and strict provenance, or what is the reasoning for not being able to have one of these?
Would https://github.com/tokio-rs/loom help with debugging of concurrent issues like this?
> If arc-swap can achieve this I would definitely use it. Edit: I posted too quickly. I see this is probably something else, unrelated to borrows, but can you always add tree borrows and strict provenance, or what is the reasoning for not being able to have one of these?
Strict provenance is not on the master branch, since that branch is supposed to support ancient versions of Rust. I've tried to migrate to it in the branch where I'm testing this… it was a nice exercise, but didn't do anything interesting. That is, in principle, this crate does adhere to strict provenance, it just isn't using the new APIs to make it explicit (which makes miri refuse it). As for tree borrows, it really seems to be unrelated to the current problem. The problem is probably some scenario I've overlooked in which multiple (likely more than 2) threads interact at the same time.
> Would https://github.com/tokio-rs/loom help with debugging of concurrent issues like this?
Considering that SeqCst is not supported there and that this crate does need SeqCst to work, I'd say it won't help much.
Overall, given how long this has taken without much result, I'm considering a bigger rewrite that drops some of the more daring guarantees (specifically, guaranteeing only lock-free rather than wait-free operation) and drops support for really ancient Rust, in the hope that this would significantly reduce the complexity. That would mean going to a 2.0 version, but maybe that's the best option right now.
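For readers unfamiliar with the distinction, here is a minimal sketch of what a lock-free (but not wait-free) update looks like. It is a generic illustration on a plain AtomicUsize, not arc-swap's implementation; the names rcu_like_update and run_writers are made up for this example. Some thread always makes progress, but any individual thread may have to retry an unbounded number of times if others keep winning the compare-and-swap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: applies `f` to the current value using a
/// compare-and-swap retry loop and returns the value that was replaced.
/// Lock-free: a failed CAS means another thread succeeded in the meantime.
/// Not wait-free: there is no bound on how often one caller may retry.
fn rcu_like_update(cell: &AtomicUsize, f: impl Fn(usize) -> usize) -> usize {
    let mut cur = cell.load(Ordering::SeqCst);
    loop {
        let new = f(cur);
        match cell.compare_exchange_weak(cur, new, Ordering::SeqCst, Ordering::SeqCst) {
            // Our update was installed; `prev` is the value we replaced.
            Ok(prev) => return prev,
            // Someone else changed the value (or the weak CAS failed
            // spuriously); retry against the value that was observed.
            Err(actual) => cur = actual,
        }
    }
}

/// Spawns `threads` concurrent writers, each incrementing the shared value
/// `iterations` times, and returns the final value.
fn run_writers(threads: usize, iterations: usize) -> usize {
    let cell = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let cell = Arc::clone(&cell);
            thread::spawn(move || {
                for _ in 0..iterations {
                    rcu_like_update(&cell, |old| old + 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    cell.load(Ordering::SeqCst)
}
```

Calling run_writers(4, 1000) should return 4000, since no increment is lost even though individual attempts may be retried.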
You can simulate SeqCst operations in loom by using SeqCst fences, like so: fence(Ordering::SeqCst). Loom does support such fences; it just doesn't support SeqCst on regular atomic operations.
It has a performance overhead, but you could make them only be compiled in when compiling for loom.
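A minimal sketch of that suggestion, under some assumptions: the helpers load_seq_cst and store_seq_cst are hypothetical names (they are not part of arc-swap), the build is assumed to pass --cfg loom when testing under loom, and wrapping an Acquire/Release access in SeqCst fences is only an approximation of a real SeqCst access. Whether that approximation preserves the guarantees this crate relies on would still need to be checked.

```rust
// Under `--cfg loom`, use loom's instrumented atomics; otherwise use std.
#[cfg(loom)]
use loom::sync::atomic::{fence, AtomicUsize, Ordering};
#[cfg(not(loom))]
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper: loom build. Approximates a SeqCst load with an
/// Acquire load surrounded by SeqCst fences, which loom does model.
#[cfg(loom)]
fn load_seq_cst(a: &AtomicUsize) -> usize {
    fence(Ordering::SeqCst);
    let value = a.load(Ordering::Acquire);
    fence(Ordering::SeqCst);
    value
}

/// Normal build: a plain SeqCst load, with no extra fence overhead.
#[cfg(not(loom))]
fn load_seq_cst(a: &AtomicUsize) -> usize {
    a.load(Ordering::SeqCst)
}

/// Hypothetical helper: loom build. Approximates a SeqCst store with a
/// Release store surrounded by SeqCst fences.
#[cfg(loom)]
fn store_seq_cst(a: &AtomicUsize, value: usize) {
    fence(Ordering::SeqCst);
    a.store(value, Ordering::Release);
    fence(Ordering::SeqCst);
}

/// Normal build: a plain SeqCst store.
#[cfg(not(loom))]
fn store_seq_cst(a: &AtomicUsize, value: usize) {
    a.store(value, Ordering::SeqCst)
}
```

With this shape, the fences only exist in loom builds, so regular builds pay no extra cost; call sites use the helpers instead of calling load or store with SeqCst directly.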