wasmtime Differential fuzzing with wasm-smith isn't the most effective

Differential fuzzing with wasm-smith isn't the most effective

Open alexcrichton opened this issue 2 years ago • 14 comments

Wasmtime recently had https://github.com/bytecodealliance/wasmtime/issues/4315 filed against it which discovered that there were two separate bugs in the SIMD implementation on x86_64. This discovery comes after "months of continuous oss-fuzzing" for the simd feature. I wanted to file an issue here with some investigation of why this happened because this theoretically should not happen.

Specifically here the bug was a buggy instruction lowering (two different ones). One fix (https://github.com/bytecodealliance/wasmtime/pull/4318) surfaced by corrupting an input register which I think only causes issues if the input is attempted to be reused elsewhere (e.g. a constant reused somewhere else). I don't know precisely but my impression was that this involved some register pressure, a "big" function, and constants to line up. This specific bug I could see as very difficult to discover via wasm-smith. The second bug, however, (https://github.com/bytecodealliance/wasmtime/pull/4317) was a trivial bug in the select instruction which showed up with the smallest of tests for select. The fact that wasm-smith never discovered this is alarming to me.

Digging in it appears to be a confluence of factors which makes wasm-smith basically unable to find these bugs:

The select instruction requires 3 operands on the stack of specific types. Turns out this very rarely happens. I inserted a panic! whenever a select instruction was even considered a candidate, and it was rarely hit. Even less rarely is the instruction chosen to be emitted.
The i32 input to select I think is almost always nonzero at runtime itself. The specific bug only happened when the condition was 0, however. I think this is because a lot of i32s come from things like i32.const which is practically never zero.
Even if select is generated with v128 inputs (which happens quite rarely) it's often never actually even executed at runtime. The few test cases I found which generated this instruction immediately had infinite recursion or an infinite loop with the interesting instructions far away.

I unfortunately don't know if there's really a "fix" for issues like this. We could throw a bunch more heuristics at wasm-smith but at some point we probably need a somewhat fundamental new strategy for fuzzing here to get significantly more coverage.

Jun 25 '22 16:06 alexcrichton

I suspect that for individual instruction semantics, something like #3251 might be a reasonable solution. If we focus on generating test cases for individual Wasm instructions, we can drive the random code generation in the other direction: e.g. we have a v128.select, we know what the stack signature is, let's provide those inputs. Along the same lines, "unit-testing" individual instructions lets us more easily run through their input data-space with many test vectors (invocations) for one function.

Without coverage feedback libFuzzer may not know that some of those inputs are only sensitive to particular values (e.g. zero), but we could maybe bias probabilities too, if needed. (Custom entropy based on... similar bits? Closeness of some args to others? Surely the propcheck/quickcheck folks have some useful heuristics here...)

All the above won't give us coverage of multiple-instruction patterns (I think this objection was also raised in #3251) but unit-testing instruction lowerings seems like a sufficiently useful and unique use-case to me that it's worth its own fuzz target and approach...

Jun 26 '22 00:06 cfallin

I had forgotten about that issue! That's an excellent point though. I also filed https://github.com/bytecodealliance/wasmtime/issues/4338 as possible other wasm fuzzers we could integrate. Otherwise though other wasm-smith improvements include https://github.com/bytecodealliance/wasm-tools/issues/266.

While this may not be the most useful issue to keep open I'm tempted to leave it here as "if anyone searches the fuzzing tag in the Wasmtime repo this'll trigger them to think about improving wasm-smith"

Jun 27 '22 20:06 alexcrichton

Hey, I had completely forgotten about #3251 but I believe when I added allowed_instructions to wasm-smith it was motivated in part by that issue. The idea would be to configure wasm-smith to generate functions with, e.g., numeric and vector instructions (see InstructionKind for the categories allowed) and set max_instructions to, e.g., 3 (to test possible interactions with other instructions). With the small test case generated, we could then also randomly generate in the value space that @cfallin mentions, by randomizing the function arguments. What do you all think of that as a new fuzz target?

Jun 27 '22 20:06 abrown

We could also add a "no control flow" mode to wasm-smith, which might end up pretty similar to the "test a single instruction" thing in practice.

In fact, @abrown added the InstructionKinds filtering and we don't really use that in our fuzz targets yet. Covers a lot of the same stuff. We could actually probably use this to avoid control instructions already.

I think we also only pass zeros in as arguments to whatever functions we generate, and we could probably do much better than that as well (at least try zero, max, and one random bit pattern).

Jun 27 '22 20:06 fitzgen

@abrown jinx ;)

What do you all think of that as a new fuzz target?

Yes, definitely.

Jun 27 '22 20:06 fitzgen

One thing I think we'd also want to change about wasm-smith is the signatures of functions as well. Right now we drop all values that don't correspond to the function's type so differential fuzzing has a hard time picking these up. Ideally we want a mode where we generate the function body first and then we wrap that in a function of the appropriate type.

Jun 27 '22 20:06 alexcrichton

Hm... how? I wasn't really aware of the limitations on function signatures.

Jun 27 '22 20:06 abrown

We just choose a signature, and then generate a body. Alex is suggesting we generate a body and then derive a signature from that.

Jun 27 '22 20:06 fitzgen

Makes sense for returns, not necessarily for parameters though, since generating a body needs to know what locals are available.

Jun 27 '22 20:06 fitzgen

Or the first instruction dictates what locals must be available which should bubble up to the signature... This is the part that I'm having trouble seeing in wasm-smith. It doesn't seem designed for this kind of thing?

Jun 27 '22 20:06 abrown

Ok, how about this: we add SingleFunctionModule to wasm-smith which takes a Config, an Unstructured, and a Signature. The Signature could be something like:

pub struct Signature {
  params: Vec<ValTy>,
  returns: Vec<ValTy>,
}

ValTy would be some set-like struct that could be Arbitrary-generated and would cover the Wasm types (i32, f32, v128, etc.). So one could either manually set the parameters of the function (I think this has come up in the past as a need) OR one could generate this Signature with a range of types. (Or something like that... looking for feedback here).

The SingleFunctionModule would then... well, what would it do? Generate instructions to use all of the arguments before we get to max_instructions? We don't have to re-local.get the arguments, right?

Jun 27 '22 21:06 abrown

One idea is to perhaps:

Generate an arbitrary list of parameters
Generate an arbitrary function body using those parameters
Using what's left on the stack assign that as the return value of the function
Insert the final function type into a module

I like the idea though of using a completely new dedicated SingleFunctionModule (or similar) generator for this. That module could also have a list of values to invoke the exported function with perhaps?

Jun 27 '22 21:06 alexcrichton

So you mean "arbitrary list of parameters" not "arbitrary list of parameter types"?

Jun 27 '22 21:06 abrown

Oh sorry I meant types there, but I was thinking we could in actuality do:

generate an arbitrary list of parameter types
Additionally, afterwards, generate a list of arbitrary values to invoke the function with

whether that second step belongs in wasm-smith or wasmtime I dunno but either seems fine by me

Jun 27 '22 21:06 alexcrichton

The single-instruction generator from Andrew came out of this and nothing else has turned up in the meantime. Additionally Andrew did a lot of refactoring to have one differential fuzzer so I think we're in a better place now (hopefully). I'm going to subsequently close this.

Dec 02 '22 00:12 alexcrichton

wasmtime wasmtime copied to clipboard

Differential fuzzing with wasm-smith isn't the most effective

wasmtime
wasmtime copied to clipboard