advisory-db icon indicating copy to clipboard operation
advisory-db copied to clipboard

`safe-uninit` is unsound

Open Hezuikn opened this issue 3 years ago • 19 comments

rust playground

the above example doesnt reproduce using the actual crate in question but its "safe" wrapper is literally just: fn safe_uninit() -> Self { unsafe { MaybeUninit::uninit().assume_init() } }

Hezuikn avatar Jun 19 '22 21:06 Hezuikn

Thank you for the report!

The issue should be first reported to the developers of the crate, so that they would have the opportunity to issue a fix before the the advisory goes live: https://github.com/max-ym/safe-uninit/issues

Once the upstream issue is fixed, or it is abundantly clear that it's not going to be fixed, we can issue an advisory.

Shnatsel avatar Jun 19 '22 22:06 Shnatsel

my understanding is that its api is fundamentally flawed and cant be properly implemented(compilers fault)

Hezuikn avatar Jun 19 '22 22:06 Hezuikn

This is quite interesting. While writing the crate at the time I have believed that any such logical contradictions would not be actually possible. Even though the safe_uninit function really just creates an uninitialized value, I thought that for basic types (like integer numbers) which are valid with any possible bit-pattern in them it is still possible to safely utilize such code. In fact, this safe_uninit function (which is implemented via trait) really is usable only for some basic types and one could not "safely uninit" anything like chars or floating points.

About the code in the playground, notice that it outputs the llvm changed its mind with optimizations off. It is the compiler optimizations that produce this unexpected result. Maybe you have run example with the crate locally in the debug mode? And otherwise it would still have failed in release mode?

I don't think that it is possible to fix this in the crate itself and it seems to me that it is really some compiler issue with the optimizations. I would expect the compiler to know that with any possible random value in the uninitialized value the condition in the playground example would still be false leading to llvm changed its mind.

The disassembly of optimized variant: https://godbolt.org/z/hvfdT4qe3

max-ym avatar Jun 20 '22 08:06 max-ym

I've re-read the docs for MaybeUninit and now I spotted that there was explicitly stated that even integer types with uninit value do not have a "fixed" value and several reads can lead to a different results, which explains behavior. I found that wrapping the value in std::hint::black_box before using it actually fixes the issue. But since black_box is not stabilized, the fix can only be used in nightly.

max-ym avatar Jun 20 '22 09:06 max-ym

i feel MaybeUninit docs should be way clearer on this; black_box states explicitly that programs should not rely on it for correctness so if this works or not is an implementation detail and i fear it could be a no-op on some targets or codegen backends but still do the cursed optimizations at the same time

Hezuikn avatar Jun 20 '22 14:06 Hezuikn

r#"Note however, that black_box is only (and can only be) provided on a “best-effort” basis. The extent to which it can block optimisations may vary depending upon the platform and code-gen backend used. Programs cannot rely on black_box for correctness in any way."#

Hezuikn avatar Jun 20 '22 14:06 Hezuikn

Maybe you have run example with the crate locally in the debug mode? And otherwise it would still have failed in release mode?

no but i made it work https://github.com/Hezuikn/unsound

Hezuikn avatar Jun 20 '22 16:06 Hezuikn

The current behavior of the compiler and the reasons behind it are described here: https://www.ralfj.de/blog/2019/07/14/uninit.html

So I'm afraid the approach taken by safe-uninit is indeed inherently unsound in an optimizing compiler.

There has been a proposal to add a "freeze" operation to LLVM to set a region of memory as set to an arbitrary value, which would allow this crate to work, but AFAIK it hasn't happened even in the upstream LLVM, let alone in Rust.

Shnatsel avatar Jun 20 '22 16:06 Shnatsel

Something like this is most likely to be okay for the foreseeable future. In exchange you'd be giving up support for asmjs and wasm and any other target without an inline asm.

fn safe_uninit() -> Self {
    let mut uninit = MaybeUninit::uninit();
    unsafe {
        core::arch::asm!(
            "/* initializing {uninit} */",
            uninit = in(reg) &mut uninit,
            options(preserves_flags),
        );
        uninit.assume_init()
    }
}

dtolnay avatar Jun 20 '22 16:06 dtolnay

@dtolnay I though about trying this too but hadn't had the time today. Did it actually work out or is it just a suggestion?

max-ym avatar Jun 20 '22 17:06 max-ym

In exchange you'd be giving up support for asmjs and wasm and any other target without an inline asm.

https://doc.rust-lang.org/nightly/unstable-book/language-features/asm-experimental-arch.html

Hezuikn avatar Jun 20 '22 18:06 Hezuikn

#![feature(asm_experimental_arch)]

let mut uninit = core::mem::MaybeUninit::uninit();
let trivial: u8 = unsafe {
    core::arch::asm!(
        "/* initializing {uninit} */",
        uninit = in(local) &mut uninit,
        options(preserves_flags),
    );
    uninit.assume_init()
};

seems to work just fine on wasm32-wasi

Hezuikn avatar Jun 20 '22 18:06 Hezuikn

sym::black_box => {
    args[0].val.store(self, result);

    // We need to "use" the argument in some way LLVM can't introspect, and on
    // targets that support it we can typically leverage inline assembly to do
    // this. LLVM's interpretation of inline assembly is that it's, well, a black
    // box. This isn't the greatest implementation since it probably deoptimizes
    // more than we want, but it's so far good enough.
    crate::asm::inline_asm_call(
        self,
        "",
        "r,~{memory}",
        &[result.llval],
        self.type_void(),
        true,
        false,
        llvm::AsmDialect::Att,
        &[span],
        false,
        None,
    )
    .unwrap_or_else(|| bug!("failed to generate inline asm call for `black_box`"));

    // We have copied the value to `result` already.
    return;
}

rust/compiler/rustc_codegen_llvm/src/intrinsic.rs

Hezuikn avatar Jun 20 '22 20:06 Hezuikn

https://github.com/rust-lang/rust/pull/58363

Hezuikn avatar Jun 20 '22 20:06 Hezuikn

Anyway, I have run several (criterion) benchmarks to see if really using the black box solution has any benefits over using well-initialized memory. By black box I mean either std::hint::black_box function or the assembly above. Though the benches are very simple and probably do not reflect real world really well, black box code is actually slower than the plain zero-ed memory.

The disassembly explains it. The compiler actually initializes this memory with something in each case. In the case of zero-ed value it is 0 (obviously) and in the case with safe_uninit this is... in a simple case it is effectively anything that already was in that register before, though it takes several instructions to move the data back and forth between the stack and the register.

#![feature(bench_black_box)]

fn safe_uninit() -> u32 {
    let mut uninit = std::mem::MaybeUninit::uninit();
    unsafe {
        core::arch::asm!(
            "/* initializing {uninit} */",
            uninit = in(reg) &mut uninit,
            options(preserves_flags),
        );
        uninit.assume_init()
    }
}

pub fn a() {
    let i = safe_uninit();
    std::hint::black_box(i);
}

pub fn b() {
    let i = 0;
    std::hint::black_box(i);
}
example::a:
        push    rax
        mov     rax, rsp

        mov     eax, dword ptr [rsp]
        mov     dword ptr [rsp + 4], eax
        lea     rax, [rsp + 4]
        pop     rax
        ret

example::b:
        sub     rsp, 4
        mov     dword ptr [rsp], 0
        mov     rax, rsp
        add     rsp, 4
        ret

https://godbolt.org/z/5Yjonaffh

max-ym avatar Jun 21 '22 09:06 max-ym

The point is that possibly because of any kind of black box the micro-optimizations intended by safe_uninit crate are never performed. This leads to the though that safe_uninit itself should not ever be used and actually is a good example of how not to optimize :)

I feel that MaybeUninit docs should maybe have some links to external resources about the subject. And maybe add at least some more examples of the code that will fail badly, like in the article mentioned above. I think it is important to show at the same time exactly how it is failing and how much unexpected and counterintuitive may happen and not just state that something is "undefined behavior". Only marking this as such makes little sense to anybody not submerged into the subject without any comprehensible demonstration.

max-ym avatar Jun 21 '22 09:06 max-ym

@Hezuikn would you like to draft & send us a PR for informational = unsound on this based on above ? Cheers

pinkforest avatar Aug 27 '22 09:08 pinkforest

re: fix - I would hope that @max-ym could put a patched version out but I guess we can draft a PR for an advisory regardless

tho it sounds to me like patching seems unfeasible ?

pinkforest avatar Aug 27 '22 09:08 pinkforest

There is one potential patch that would make the API sound:

    fn safe_uninit() -> Self {
        unsafe {
            core::mem::zeroed()
        }
    }

All the types which implement SafeUninit are validly zeroable.

Nemo157 avatar Aug 17 '23 22:08 Nemo157