Move constant pools in compiled code out of the .text section

Open alexcrichton opened this issue 1 year ago • 4 comments

Today Cranelift's constant pools are located in the .text section of the executable, typically right after the function that uses them. While convenient for code generation, this exposes a possible attack vector in Wasmtime: it's trivial to put a "gadget" somewhere in executable memory. For example, using a sequence of v128.const instructions it would be pretty easy to assemble "machine code" at the end of a function. In the face of a bug in Cranelift, this could make it easier to amplify that bug into a sandbox escape.

As a defense-in-depth measure we should try to move the constant pools out of the .text section and into a .data or otherwise read-only section (not writable or executable). This won't be trivial: relocations in the text section would point into the data section, and the relocation range may not always be large enough to span the entire text section. Regardless, I wanted to file an issue about this idea.

alexcrichton avatar Oct 08 '24 21:10 alexcrichton

It would definitely be nice to have support for this -- in principle we could return two blobs of bytes as the result of per-function compilation instead of one, and have a relocation type that is "offset from start of this function's code to start of this function's constants".
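To make the two-blob idea concrete, here is a minimal sketch in Rust. None of these names come from Cranelift: `CompiledFunction`, `ConstReloc`, and `pc_relative_delta` are hypothetical, and the sketch only illustrates how a "code to constants" relocation could be resolved once both blobs have been assigned final addresses.

```rust
/// Hypothetical per-function compilation result: machine code and constants
/// are returned as separate blobs so they can land in different sections.
pub struct CompiledFunction {
    pub code: Vec<u8>,
    pub constants: Vec<u8>,
    pub relocs: Vec<ConstReloc>,
}

/// Hypothetical relocation: the instruction at `code_offset` (relative to the
/// start of this function's code) refers to the constant at `const_offset`
/// (relative to the start of this function's constant blob).
pub struct ConstReloc {
    pub code_offset: u32,
    pub const_offset: u32,
}

/// Given the final addresses of the two blobs, compute the signed
/// PC-relative distance from the referencing instruction to its constant.
pub fn pc_relative_delta(reloc: &ConstReloc, code_base: u64, const_base: u64) -> i64 {
    let pc = code_base + u64::from(reloc.code_offset);
    let target = const_base + u64::from(reloc.const_offset);
    // Wrapping subtraction followed by a cast yields the two's-complement delta.
    target.wrapping_sub(pc) as i64
}
```

Whether the resulting delta fits in a given instruction's immediate field is a separate, per-architecture question.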

Out of curiosity, do you happen to know how ld handles .rodata references today for very large aarch64/riscv64/... binaries? I wonder if it uses its support for relaxation (assuming the most pessimistic range sequence, then shrinking if able) -- it'd be unfortunate to have to use adrp/add/ldr rather than the immediate-pcrel form of ldr for every constant. I'm not able to find anything on this at the moment...
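For reference, the range question on aarch64 can be sketched as below. The limits come from the instruction encodings: `LDR (literal)` has a signed 19-bit immediate scaled by 4 (about ±1 MiB), while `ADRP` has a signed 21-bit immediate counted in 4 KiB pages (about ±4 GiB). The function names are hypothetical and not taken from Cranelift or any linker.

```rust
/// Does a byte offset fit the PC-relative `LDR (literal)` form?
/// The immediate is 19 bits, signed, and scaled by 4.
pub fn fits_ldr_literal(delta: i64) -> bool {
    const MIN: i64 = -(1 << 20);
    const MAX: i64 = (1 << 20) - 4;
    delta % 4 == 0 && delta >= MIN && delta <= MAX
}

/// Can `ADRP` at address `pc` reach the 4 KiB page containing `target`?
/// The immediate is 21 bits, signed, and counted in pages.
pub fn fits_adrp(pc: u64, target: u64) -> bool {
    const MIN_PAGES: i64 = -(1 << 20);
    const MAX_PAGES: i64 = (1 << 20) - 1;
    let page_delta = ((target >> 12) as i64) - ((pc >> 12) as i64);
    page_delta >= MIN_PAGES && page_delta <= MAX_PAGES
}
```

A relaxation pass, if one were used, would start from the `ADRP`-based sequence and shrink to the literal form only where `fits_ldr_literal` holds.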

cfallin avatar Oct 08 '24 21:10 cfallin

That's an excellent question, and one I don't know the answer to myself. I can try to play around with an assembler though and see what happens perhaps!

alexcrichton avatar Oct 08 '24 22:10 alexcrichton

I tried briefly to trigger something interesting, but got stuck trying to get clang (on macOS/aarch64) to use the short-form LDR-with-immediate instruction; for any load from rodata it seems to use an adrp/add pair.

For example, with the following (split into separate files to avoid a neat optimization where clang const-folds the load of constant data):

% cat test.c
extern const char* s;

int foo() {
    return *((int*)s);
}

% cat data.c
const char* s = "1234";

% cat main.c
#include <stdio.h>
extern int foo();
int main() {
    printf("%d\n", foo());
}

I see _foo's body as

0000000100003f44 <_foo>:
100003f44: b0000028    	adrp	x8, 0x100008000 <_s>
100003f48: 91000108    	add	x8, x8, #0x0
100003f4c: f9400108    	ldr	x8, [x8]
100003f50: b9400100    	ldr	w0, [x8]
100003f54: d65f03c0    	ret

Maybe it wouldn't be so bad to unconditionally emit that form actually; loads from constant pools will be relatively rare. It does burn a register though to compute the address.

cfallin avatar Oct 08 '24 23:10 cfallin

Good point!

Looks like

#[no_mangle]
pub extern "C" fn foo() -> f64 {
    1.3484
}

generates

.LCPI0_0:
        .xword  0x3ff5930be0ded289
foo:
        adrp    x8, .LCPI0_0
        ldr     d0, [x8, :lo12:.LCPI0_0]
        ret

So yeah, it looks like we may want to be a tiny bit clever (don't always "just" materialize the address and instead fold the low bits into the load), but otherwise solving this issue would involve always doing adrp on aarch64 and the equivalent on riscv64.
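As a rough illustration of the `adrp` + `ldr ..., :lo12:` form, the sketch below splits a target address into the page delta that `adrp` encodes and the low 12 bits that the load folds in. The helper names are hypothetical. The scaling rule is from the `LDR (immediate, unsigned offset)` encoding: for an 8-byte load the 12-bit immediate is scaled by 8, so the constant must be 8-byte aligned for this form to apply.

```rust
/// Split `target` into (page delta for ADRP, low 12 bits for the load),
/// relative to an ADRP instruction located at `pc`.
pub fn adrp_lo12(pc: u64, target: u64) -> (i64, u32) {
    let page_delta = ((target >> 12) as i64) - ((pc >> 12) as i64);
    let lo12 = (target & 0xfff) as u32;
    (page_delta, lo12)
}

/// Encoded immediate for an 8-byte `ldr` with unsigned offset, or `None`
/// if the low bits are not 8-byte aligned and the offset cannot be folded in.
pub fn ldr64_scaled_imm(lo12: u32) -> Option<u32> {
    if lo12 < 0x1000 && lo12 % 8 == 0 {
        Some(lo12 / 8)
    } else {
        None
    }
}
```

With the addresses from the `_foo` disassembly earlier in the thread, `adrp_lo12(0x100003f44, 0x100008000)` gives a page delta of 5 and a low part of 0, which matches the `#0x0` in the following `add`.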

alexcrichton avatar Oct 09 '24 00:10 alexcrichton