ring Adding tests and fuzzing to der.rs (using cargo-fuzz)

Here is an experiment to add fuzzing to der.rs. Bunch of thoughts and notes:

It looks like there are two popular options for fuzzing in Rust: cargo-fuzz and afl.rs. I picked the first one, not for any specific reason. It may be interesting to try to implement the same with afl.rs to see what the experience is.
This was easy to setup but it does require Rust Nightly tooling on Intel. (No M1 support - sad)
I implemented a super basic fuzzer for fuzz_bit_string_with_no_unused_bits which may not be the best function to fuzz. I'll add some more der.rs coverage in the coming days.
Unclear that corpus is thrown at this code. It was my understanding that fuzzing could also use known good input as a starting point instead of just throwing random data at it. Both strategies are useful I think and I will explore that a bit further.
With cargo-fuzz, fuzzers are small standalone binaries. You basically implement a single function that the fuzzer tooling with turn into a binary that can then run in the fuzzing framework.
Simply running cargo fuzz run fuzz_bit_string_with_no_unused_bits is very nice and easy to automate
I had hoped you could combine test_* and fuzz_* functions easily in a module's test section, but that is not possible. It is definitely possible to extract common code and share that between fuzzing and unit testing. (Although I am not familiar enough with Rust yet to understand how that would work with public/private and test namespaces.)

To run this:

rustup override set nightly
cargo install cargo fuzz
cargo fuzz run fuzz_bit_string_with_no_unused_bits

Let me know what you think @briansmith .. I'll play around with this a bit more. No expectation that this PR will land - consider it an experiment.

I'm going to try to be more creative with fuzzing in Ring, but I think the more interesting fuzzing should be done in WebPKI - the primitives here are probably good to be fuzzed but I have a gut feeling that it is not going to give interesting results. Fuzzing larger parsers like in WebPKI is a lot more interesting I think.

Apr 23 '21 15:04 st3fan

I tried to run this on an M1 Mac Mini but I don't think that is supported yet by this tooling:

libFuzzer needs LLVM sanitizer support, so this only works on x86-64 Linux and x86-64 macOS for now. This also needs a nightly compiler since it uses some unstable command-line flags. You'll also need a C++ compiler with C++11 support.

I'm getting a:

error: the option `Z` is only accepted on the nightly compiler

when running cargo fuzz run ... but I think what it really means is that this option is not available on aarch64. (I'm using the same version, rustc 1.53.0-nightly (7f4afdf02 2021-04-22), between my Intel Air and M1 Mini)

Maybe a speedy Linux machine is good for fuzzing.

Apr 23 '21 15:04 st3fan

Hi, I'll take a look at this really soon. I just wanted to jot down my initial thinking:

My current thoughts are to try to follow the pattern suggested in https://alastairreid.github.io/papers/hatra2020.pdf:

Define most tests as parameterized test functions: fn test_foo(test_case: FooTestCase) { ... }.
Have a single (probably) #[test] wrapper like this:

#[test]
fn test_foo_kat() { // `kat` is jargon for "known answer test"
    const TEST_CASES: &[FooTestCase] = &[...] {
    };
    TEST_CASES.for_each(test_foo)
}

Also use a property-based testing/fuzzing framework to generate random inputs for the parameterized test function. Depending on the testing library, this could be done by having the FooTestCase implement Arbitrary or similar.
Eventually replace the property-based testing with verification (proving) with minimal changes.

Apr 28 '21 02:04 briansmith

This was easy to setup but it does require Rust Nightly tooling on Intel. (No M1 support - sad)

Rust nightly isn't the end of the world for an initial stab, though we should investigate options that are most likely to work on stable.

I implemented a super basic fuzzer for fuzz_bit_string_with_no_unused_bits which may not be the best function to fuzz. I'll add some more der.rs coverage in the coming days.

Let's consider different goals of the fuzzing:

Verify that no matter what, the function never panics or crashes. That seems to be mostly what you're doing in this draft.
No matter the values of the encoded bits, any non-zero value for the "unused bits" part is rejected. For this, we'd need some scaffolding to construct an input based on two parameters: the value of the "unused bits" field (a small integer) and the value of the bitfield (a slice of bytes?). Then the fuzz tests asserts that if the unused bits field is nonzero then the result is always Err.

Unclear that corpus is thrown at this code. It was my understanding that fuzzing could also use known good input as a starting point instead of just throwing random data at it. Both strategies are useful I think and I will explore that a bit further.

Not just known good input, but known inputs with known expected results. I suggest we try to find an approach that lets us use your manually-written known-answer tests as inputs to the fuzzer.

With cargo-fuzz, fuzzers are small standalone binaries. You basically implement a single function that the fuzzer tooling with turn into a binary that can then run in the fuzzing framework.

If we have to do that, we can. But it would be better, perhaps, to use a framework that lets us define a bunch of fuzzers grouped together in a single file with less boilerplate.

I'm going to try to be more creative with fuzzing in Ring, but I think the more interesting fuzzing should be done in WebPKI - the primitives here are probably good to be fuzzed but I have a gut feeling that it is not going to give interesting results. Fuzzing larger parsers like in WebPKI is a lot more interesting I think.

For fuzzing webpki I suggest you take a peek at what Google has done for fuzzing their X.509 library. I know they had to add a way to disable signature verification because otherwise the fuzzer doesn't make enough progress since it will literally never guess a valid signature.

Apr 28 '21 02:04 briansmith

Also, thanks for working on this!

I would suggest a humble project: Factor out the code that can construct a Tag-Length-Value encoding from a (tag, value) pair. Test that parsing such an encoding succeeds when the tag matches and the length is <= the maximum supported length, and returns Err otherwise, and never crashes or panics.

Apr 28 '21 02:04 briansmith

I am curious about your thoughts on https://github.com/AltSysrq/proptest for fuzzing compared to the others you found. Maybe I'm just too biased towards it do to what I've researched so far, but it does seem like a great fit.

Apr 28 '21 02:04 briansmith