inkwell icon indicating copy to clipboard operation
inkwell copied to clipboard

Undefined behavior with `MemoryBuffer::create_from_memory_range`

Open novacrazy opened this issue 4 years ago • 5 comments

Describe the Bug:

This code will result in a panic from LLVM saying expected top-level entity from the LLVM IR.

pub static PRELUDE_LLVM_IR: &[u8] = include_bytes!("prelude.llvm");

#[inline(never)]
pub fn build_prelude<'ctx>(ctx: &JitContext<'ctx>) -> Module<'ctx> {
    let module = ctx
        .create_module_from_ir(MemoryBuffer::create_from_memory_range(
            PRELUDE_LLVM_IR, 
            "prelude"
        ))
        .unwrap();

    module
}

However, if I were to mark build_prelude as #[inline(always)], suddenly it doesn't panic. JitContext is just a small wrapper around inkwell::context::Context.

Solution

Use MemoryBuffer::create_from_memory_range_copy. Seems create_from_memory_range leaking memory causes it to become corrupted or something funky, but if it's inline then the value is kept on the stack long enough for LLVM to parse it. I don't know. I've also only just discovered this, so it's possible create_from_memory_range_copy may also have issues, but so far it at least fixes that odd behavior.

Details

I build debug with opt-level=3, so perhaps that has something to do with it.

  • LLVM Version: 10.0
  • Inkwell Branch Used: master
  • OS: Windows 10

If you're curious, here is the current prelude.llvm. It's just some intrinsic stuff with fast call.

; ModuleID = 'prelude'
source_filename = "prelude"

declare <4 x float> @llvm.fma.v4f32(<4 x float>, <4 x float>, <4 x float>) #0

; Function Attrs: nounwind inlinehint
define <4 x float> @raygon.fma.v4f32(<4 x float> %0, <4 x float> %1, <4 x float> %2) #1 {
entry:
    %res = call fast <4 x float> @llvm.fma.v4f32(<4 x float> %0, <4 x float> %1, <4 x float> %2)
    ret <4 x float> %res
}

attributes #0 = { nounwind readnone speculatable willreturn }
attributes #1 = { nounwind inlinehint }

novacrazy avatar May 02 '20 08:05 novacrazy

Good find, thanks!

TheDan64 avatar May 02 '20 20:05 TheDan64

Hello, the problem here is that this function is incredibly unsound. A MemoryBuffer provides "read-only access to a block of memory" (docs), which means that when one is made with create_from_memory_range it still points into the original source &[u8]. This means that if and when that original &[u8] is dropped, the underlying memory is invalidated and causes UB. This can also cause UB because the Rust owner of the byte slice can freely mutate the slice after they've created the MemoryBuffer while LLVM could be reading it, causing more UB. Additionally, a MemoryBuffer must be null terminated as stated in the description of MemoryBuffer, and inkwell currently does not check for this. The docs of create_from_memory_range also say that the function is "leaking memory" when it isn't, it's just allowing LLVM to look at a slice of Rust-owned memory in a read-only fashion.

Kixiron avatar Jun 11 '20 01:06 Kixiron

@Kixiron in the example above the &[u8] is 'static, so this does not hold any water.

seanyoung avatar Dec 04 '20 09:12 seanyoung

MemoryBuffer does not have any lifetime constraints and the slice it receives can have an arbitrary lifetime, which means you can trivially cause use-after-free scenarios

Kixiron avatar Dec 04 '20 12:12 Kixiron

That is true but how is that relevant for this bug?

seanyoung avatar Dec 04 '20 13:12 seanyoung