log
log copied to clipboard
Relax atomic orderings
Currently this crate uses SeqCst orderings for all atomic operations related to logger initialization. At most these orderings would need to be AcqRel since they only involve a single atomic. This PR relaxes all orderings as much as possible without introducing visibility issues.
I also changed the initialization of the filter to be LevelFilter::Off as usize for clarity.
Closes #509
In talking with some other developers some concern was presented over whether or not a write to LOGGER could be re-ordered before a relaxed atomic load despite that write being conditional. This commit defensively changes the ordering to Acquire to ensure this does not happen, but if it is guaranteed that such a reordering could not occur with Relaxed then that commit can be reverted.
I'm not sure that this new ordering is valid, but does it really matter? The logger is set at most once, usually during startup, personally I don't think the performance concerns at that stage outweigh getting this wrong...
The initialization of the logger wasn't really of concern here, the SeqCst load in the logger function was. I did a couple benches and didn't see a significant change in performance.
A couple remarks on that: this doesn't change a whole lot. Using SeqCst everywhere is analogous to using a RwLock for a single thread instead of a RefCell. Will it have a noticeable impact on performance? Probably not, but it's wholly unnecessary. Again, since this is a single atomic variable, you could change every SeqCst ordering to AcqRel/Acquire/Release and it would be exactly equivalent. SeqCst only causes logically different behavior when it synchronizes with atomic operations on different memory locations or SeqCst fences.
If the primary concern here is correctness over performance, then I'd personally suggest not using a home-brew Once. For instance the unbounded spinning in set_logger_inner is almost certainly the wrong choice. OnceCell and other crates provide #![no_std] battle-tested implementations of this pattern with minimal but sufficient orderings, as well as better unsafe hygiene that doesn't involve static mut.
If you want to close this that's fine. My two cents is that in its current state the once implementation here is non-optimal (and this PR doesn't fix the aforementioned unbounded spin loop), and that if you want correctness then a better solution would be to rely on a 3rd party once crate with more thorough auditing than to simply make all orderings SeqCst.
EDIT: I forgot to mention, if the correctness of this implementation was of primary concern, then I could add loom tests which would give pretty substantial reassurance that the orderings are correct.
@Cassy343 I'm familiar with atomic ordering, that wasn't my point. My point was that the ordering is tricky to get right (even if you following docs precisely) and that I don't think it's worth it for the setup phase of a program.
So, I propose to reduce this pr to just change the atomic ordering of the logger function as that's really the only one where performance matters.
Regarding the ordering logger I think we can lower it all the way to Relaxed, iff we properly document it. I imagine (but have no data to backup this statement) that most multi-threaded applications will setup logging before starting threads. Given that, changing logger to Relaxed shouldn't be an issue. One exception might be testing, where the creation of the test threads isn't handled by the developer (assuming the default testing harness). But again make a reasonable assuming that all tests that need logging will attempt to set it up, thus calling something that loads using SeqCst again, I don't think this will be an issue.
That ordering should not be reduced to relaxed. Even if we properly document it, if one thread initializes the logger and another concurrently tries to access the logger, then that's undefined behavior which is not allowed to occur in safe code, so that function would need to be marked as unsafe.
I'd also wager that a non-trivial number of users are initializing the logger after spinning up threads due to libraries such as tokio and rocket providing macros that wrap your main and spin up a thread pool before executing your code. An acquire load is the minimally sufficient ordering to make that function safe provided that, after LOGGER is written to, STATE is set to INITIALIZED with release ordering or stronger.
That ordering should not be reduced to relaxed. Even if we properly document it, if one thread initializes the logger and another concurrently tries to access the logger, then that's undefined behavior which is not allowed to occur in safe code, so that function would need to be marked as unsafe.
How do you think this is possible?
I'd also wager that a non-trivial number of users are initializing the logger after spinning up threads due to libraries such as tokio and rocket providing macros that wrap your
mainand spin up a thread pool before executing your code. An acquire load is the minimally sufficient ordering to make that function safe provided that, afterLOGGERis written to,STATEis set toINITIALIZEDwith release ordering or stronger.
I disagree. I think if you put a atomic::fence(SeqCst) at the bottom of set_logger and use STATE.load(Relaxed) == INITIALIZED it not possible for a thread to read state == INITIALIZED with the logger not set.
I disagree. I think if you put a atomic::fence(SeqCst) at the bottom of set_logger and use STATE.load(Relaxed) == INITIALIZED it not possible for a thread to read state == INITIALIZED with the logger not set.
That is not what fence(SeqCst) does. With all accesses being Relaxed, there is no acquire operation for the release part of the fence to attach to. If there was, fence(Acquire) would be sufficient, but there isn't so there isn't a correct option.
I agree with @Cassy343's PR, and assesment that going to relaxed introduces undefined behaviour in various cases of safe use of the function.
To add on to @chorman0773's point, here is a (correctly failing) loom test demonstrating that Relaxed ordering causes UB:
fn main() {
use loom::{sync::atomic::{AtomicUsize, Ordering, fence}, cell::UnsafeCell, thread};
use std::sync::Arc;
loom::model(|| {
let number = Arc::new(UnsafeCell::new(0i32));
let state = Arc::new(AtomicUsize::new(0));
thread::spawn({
let number = Arc::clone(&number);
let state = Arc::clone(&state);
move || {
if state.load(Ordering::Relaxed) == 2 {
number.with(|ptr| assert_eq!(unsafe { *ptr }, 1));
}
}
});
thread::spawn(move || {
match state.compare_exchange(0, 1, Ordering::SeqCst, Ordering::SeqCst) {
Ok(0) => {
number.with_mut(|ptr| unsafe { *ptr = 1 });
state.store(2, Ordering::SeqCst);
},
_ => unreachable!()
}
fence(Ordering::SeqCst);
});
});
}
@Cassy343 I don't use Loom, but that's not exactly what I had in my. I was thinking the following.
fn main() {
use loom::{sync::atomic::{AtomicUsize, Ordering, fence}, cell::UnsafeCell, thread};
use std::sync::Arc;
loom::model(|| {
let number = Arc::new(UnsafeCell::new(0i32));
let state = Arc::new(AtomicUsize::new(0));
thread::spawn({
let number = Arc::clone(&number);
let state = Arc::clone(&state);
move || {
if state.load(Ordering::Relaxed) == 2 {
number.with(|ptr| assert_eq!(unsafe { *ptr }, 1));
}
}
});
thread::spawn(move || {
match state.compare_exchange(0, 1, Ordering::SeqCst, Ordering::SeqCst) {
Ok(0) => {
number.with_mut(|ptr| unsafe { *ptr = 1 });
fence(Ordering::SeqCst); // Sync the write above.
state.store(2, Ordering::SeqCst);
},
_ => unreachable!()
}
});
});
}
But I think you guys are right and we still need something this proposed fence can sync with on the reading side (the log calls).
I'm not sure that this new ordering is valid, but does it really matter? The logger is set at most once, usually during startup, personally I don't think the performance concerns at that stage outweigh getting this wrong...
The impact is not limited to the startup phase, incurring unnecessary overhead every time the log!() is used, and SeqCst is a very expensive operation on some platforms.
pub fn logger() -> &'static dyn Log {
if STATE.load(Ordering::SeqCst) != INITIALIZED {
Thanks for the PR @Cassy343! I think one of your earlier points about not using a home-grown once cell for this is pretty valid, and once OnceLock stabilizes we should move to it. In the meantime, synchronizing on the global logger is almost the only thing this library actually does, so it seems fair that we should try do a good job of it. Maybe we could move this synchronization code into its own sub-module and add a loom test for it?