cargo-fuzz icon indicating copy to clipboard operation
cargo-fuzz copied to clipboard

Add a flag to toggle -Zbuild-std, and default to using it

Open saethlin opened this issue 2 years ago • 24 comments

By default, all the debug assertions in the standard library are compiled out, and the only way to get them back is -Zbuild-std.

Even generic debug assertions can catch code that passes the sanitizers. For example, in this PR I'm trying to add more and I found a few bugs in the compiler by doing so: https://github.com/rust-lang/rust/pull/92686

Having these assertions on in general is also a very good idea, because even if the behavior would be caught eventually, catching it at the source is an aid to debugging. Example here: https://reddit.com/r/rust/comments/t87itu/how_can_i_debugreport_possible_compiler_bugs/

Turning on -Zbuild-std with debug assertions enabled introduces a performance regression. I'm trying to squint at a bunch of numbers over here, but things keep shifting around. Maybe I'm just bad at this. My best guess is that turning on -Zbuild-std is a ~20% regression. Still much better than just turning on ASan, which is a ~60% regression.

I'm also a big fan of having these on by default because these debug assertions work everywhere and with all codebases, as opposed to tools like ASan which doesn't necessarily.

saethlin avatar Mar 07 '22 22:03 saethlin

I'm not convinced that we want to do this by default, mainly because of the compile time overhead rather than run time overhead.

I think having an option makes a lot of sense, though.

fitzgen avatar Mar 07 '22 23:03 fitzgen

I agree that the compile time overhead is concerning, but cargo-fuzz already enables ASan by default, which has a dramatic effect on the compile time of large codebases.

Building an image fuzz target: No sanitizers

real	0m52.239s
user	2m0.496s
sys	0m5.073s

ASan

real	1m47.423s
user	3m46.308s
sys	0m8.821s

-Zbuild-std, no sanitizers

real	1m8.773s
user	3m13.047s
sys	0m9.173s

Building a serde-json fuzz target: No sanitizers

real	0m20.624s
user	0m29.311s
sys	0m2.069s

ASan

real	0m22.219s
user	0m35.924s
sys	0m2.225s

-Zbuild-std, no sanitizers

real	0m37.803s
user	1m44.005s
sys	0m4.873s

The checks added by ASan have not found all that many bugs in the Rust ecosystem. I suspect if/when the PR I linked lands, it will find more UB than the ASan option. So I think you're left making an awkward argument about the value of the UB that ASan may detect to continue to justify it as a default.

I am also concerned about codebases supposedly having fuzzing but missing bugs because they didn't turn on some option. We already run into this with Miri. If you say "my code passes Miri" there are a growing number of people who will say "Well did you run RUSTFLAGS="-Zrandomize-layout" MIRIFLAGS="-Zmiri-symbolic-alignment-check -Zmiri-check-number-validity -Zmiri-tag-raw-pointers" cargo miri?" Now of course those are all opt-in for a reason, but as of today I've seen them all find real bugs that otherwise go undetected.

saethlin avatar Mar 08 '22 00:03 saethlin

The sanitizer(s) are (or were in the past, at least) required for fuzzing to work at all, as the instrumentation from sanitizers was how libfuzzer would gather information from the program execution.

nagisa avatar Mar 08 '22 13:03 nagisa

When I say "no sanitizers" I mean --sanitizer=none. Even with that, cargo fuzz still passes a number of sanitizer-coverage-* flags.

Looks like coverage doesn't work, because of https://github.com/rust-lang/wg-cargo-std-aware/issues/63 sigh

saethlin avatar Mar 08 '22 14:03 saethlin

I agree that the compile time overhead is concerning, but cargo-fuzz already enables ASan by default, which has a dramatic effect on the compile time of large codebases.

This is not an argument to add even more compile time overhead. If anything it motivates disabling ASan by default (which I would be open to! but its default enabling predates my time here, so I'd want to get input on that from @Manishearth and @nagisa).

fitzgen avatar Mar 08 '22 17:03 fitzgen

iirc we enabled ASan because that was the only way at the time to get sanitizer tooling

Manishearth avatar Mar 08 '22 18:03 Manishearth

This is not an argument to add even more compile time overhead

My apologies for the confusion. I agree. I meant to point out that I think the cost and benefits of -Zbuild-std and ASan are quite comparable, so they should probably have the same default.

saethlin avatar Mar 08 '22 18:03 saethlin

Okay so my take away thus far in the discussion is that we should:

  • Have a --build-std flag in cargo fuzz that does -Zbuild-std and has an explanation in the flag's help text for why you might want to pass this flag (but also explain that it has some modest compile time overhead)

  • Switch from ASan to no sanitizer enabled by default.

Does that sound reasonable?

fitzgen avatar Mar 08 '22 18:03 fitzgen

I think so yeah. We should make sure it's easy to turn on the sanitizers and document both these options prominently

Manishearth avatar Mar 08 '22 18:03 Manishearth

Perhaps raise a warning when asan isn't enabled for a couple releases so people know what's going on?

Manishearth avatar Mar 08 '22 18:03 Manishearth

I'd be happy to see -Zbuild-std integrated somehow as well.

Personally, I think extra 30 seconds to build the standard library are probably inconsequential when the fuzz target will be running for a long time. In my eyes, most of the resources in fuzzing will be spent on actually running the fuzz target itself rather than compiling them, and so optimizing for the chance to detect an issue might make more sense?

nagisa avatar Mar 08 '22 20:03 nagisa

This also works for me; a slight worry i have is that the instrumented stdlib will slow down fuzzing itself, but I suspect it's not too bad.

Manishearth avatar Mar 08 '22 20:03 Manishearth

:facepalm: cargo-fuzz already supports passing through -Zbuild-std to Cargo, all you have to do is run cargo fuzz run fuzz-target -Zbuild-std. So it's already possible but somehow I managed to not notice this until I opened up the CLI code.

I'm still not opposed to turning off ASan by default with a warning, but should there be a second way to turn on build-std?

saethlin avatar Mar 08 '22 22:03 saethlin

In general I think we add generally useful rustc options as separate flags so they're more prominent in --help and such

Manishearth avatar Mar 08 '22 22:03 Manishearth

cargo fuzz coverage and --build-std are incompatible at the moment: https://github.com/rust-lang/wg-cargo-std-aware/issues/63. I added a check that should bail out correctly if a user attempts to use a build configuration that won't work.

saethlin avatar Mar 17 '22 00:03 saethlin

Assuming that

Personally, I think extra 30 seconds to build the standard library are probably inconsequential when the fuzz target will be running for a long time.

Is true and that enabling these flags would indeed find more issues then I don't really see a reason not to? Though, I am also assuming that it building the std is visible to the user and that sooner or later a better way to get these checks is available.

The reasoning for this is simple: It sounds like those 30 extra seconds actually save time in the long run as it means that people don't have to run it a second time because they forgot to enable them.

Also, if you go out of your way to fuzz your crate, you do it to find bugs. And though options are nice to sacrifice some detection in favor of speed, I think that in general people care more about the amount of bugs that get found by fuzzing then how long the fuzzing takes.

So, TL;DR: I vote for enabling both Asan and build-std by default but make it clear to the user that both are enabled and make it easy for the user to find out what their benefits and downsides are.

lenscas avatar Mar 17 '22 01:03 lenscas

I am also concerned about codebases supposedly having fuzzing but missing bugs because they didn't turn on some option. We already run into this with Miri. If you say "my code passes Miri" there are a growing number of people who will say "Well did you run RUSTFLAGS="-Zrandomize-layout" MIRIFLAGS="-Zmiri-symbolic-alignment-check -Zmiri-check-number-validity -Zmiri-tag-raw-pointers" cargo miri?" Now of course those are all opt-in for a reason, but as of today I've seen them all find real bugs that otherwise go undetected.

For both fuzzing and Miri, this seems like a situation where it might be good to have "profiles" to pick from, in the same sense that Cargo offers profiles of compiler options (dev/release/test/bench). These profiles could provide different sets of “good defaults” with documented intended use cases, so that a user can choose one and get good-enough results, while still allowing customization by additional flags.

This would improve on specifying individual flags because:

  • Figuring out what each flag does may be intimidating to people who have heard fuzzing or running under Miri is worth doing but aren't familiar with the theory behind the things the flags talk about. ("What's a sanitizer?")
  • Specific functionality described by flags may be added, removed, or replaced with different flags as compiler/fuzzer/miri architecture changes, and it would be easier to know how to update an old configuration if it is primarily based on a profile of documented intent, with only customizations as separate flags.

Some possible use-cases/profiles:

  • "Turn on everything!"
  • "Lots of low-hanging fruit, so please keep compile times fast while I iterate on fixing bugs"
  • "I'm testing a non-unsafe algorithm for edge cases; don't bother with sanitizers because the bugs I care about will manifest as panics"

Choosing a profile could be mandatory, so that there is no need to consider what the defaults in the absence of a profile would be.

kpreid avatar Mar 17 '22 01:03 kpreid

(for transparency, i was linked to this thread to leave a comment by saethlin)

One bug that shows up a fair bit in binary parsing that asan does catch is OOM. With asan, there's a nice backtrace leading you to the place you allocated the memory. Without, it either ignores it, or presumably just aborts on OOM. So even if asan doesn't catch any safety issues, catching OOM issues (where people read a u32 or u64 length from some file, and then do vec![0_u8; length] without validating the length actually makes sense) is very worth the compile time hit.

Though I generally do run with -snone if I know that's not an issue and I'm looking more for panics, I've not found the compile time hit of asan to be that significant.

I generally don't go out of my way to -Zbuild-std, but I've tried it now with the serde_json fuzzer and the build time is definitely what I'd call perfectly reasonable for a fuzz target, it's something I'm fine with being the default, personally.

I'd definitely appreciate some way to catch internal invariants being broken, especially since you can't fuzz with miri (imagine the executions per second :fear:), at least some debug assertions are good.


From the benchmarks above, -Zbuild-std looks negligible in terms of perf to compile time, so I think making it on by default is good. It does look to have actually a pretty significant hit to executions per second (13k or so with serde with -Zbuild-std, 25k or so without), which is what I'd be more concerned about though.

That's not too bad, but you'd definitely want to turn that off if you're fuzzing safe code where you're not looking to hit the debug assertions. So I'm going with "on by default but make it easy to turn off"

As for asan, it's much heavier a compile time hit and runtime hit (without -Zbuild-std, -snone gets around 80k, asan gets 25k as said above).

This is already easy to turn off, so I'm inclined to also say leave it on by default, making the defaults "slow but accurate" in terms of how many bugs will be caught. You'd need to leave the fuzzer on for maybe 2-8x longer than you would otherwise, but I'm personally fine with just teaching users to "pass -snone --no-build-std if you're fuzzing safe code that doesn't have a risk of OOMing".

5225225 avatar Mar 17 '22 01:03 5225225

Choosing a profile could be mandatory, so that there is no need to consider what the defaults in the absence of a profile would be.

I agree with this. No need to figure out what configuration people would most likely be using if there are no defaults. It also gives a perfect opportunity to teach the user about what options are available.

lenscas avatar Mar 17 '22 11:03 lenscas

According to this Mozilla blog post, Thread Sanitizer will not detect some of the races unless -Z build-std is used. I assume this also applies to Address Sanitizer, since some allocations are not instrumented.

I made -Z build-std mandatory for Memory Sanitizer earlier because not doing so results in false positives; but I fear for other sanitizers not having it results in false negatives. So I think what really should be discussed is whether debug assertions are enabled by default, not -Z build-std. We probably should have been enabling that all along.

This is already easy to turn off, so I'm inclined to also say leave it on by default, making the defaults "slow but accurate" in terms of how many bugs will be caught

Agreed. Fuzzing, just like cryptography, is a field where you get no feedback when you're doing something horribly wrong. It seems like everything is working, when in reality it's not. So just like cryptography, fuzzing should default to the most robust option that makes it really hard to mess up. Any optimizations should be opt-in for people who really know what they're doing.

I am reminded of CryptoCat developers infamously setting "1" as the RSA exponent because that was much faster than the alternatives, which led to the data not being encrypted at all (1 to any power is always 1).

I really like cargo-fuzz especially because it enables ASAN by default, so I can just tell anyone "use cargo-fuzz" and all the bugs will be detected, instead of saying "oh, use cargo-fuzz, and invoke it with this and that flag", and see their eyes glaze over.

Shnatsel avatar Mar 30 '22 13:03 Shnatsel

I am commenting here because the motivating PR with a bunch of debug assertions in core was just merged.

I agree very strongly with @Shnatsel's point, and I think that by default we should do the most checks possible. If it were up to me, we'd have ASan + build-std + debug_assertions on by default. If people want to turn off debug assertions or build-std, they can turn off debug assertions in Cargo.toml and they can pass --build-std=false.

I sort of agree with @kpreid's suggestion, in theory. I think that usage of cargo-fuzz is easily categorized but when it comes down to picking names for the profiles and deciding exactly what will be in them I'm at a loss. ASan, TSan, and MSan all conflict with each other, so we need either at least 4 profiles, or some mixture of profile flags and sanitizer flags, at which point we don't really have fewer flags than stipulating all the individual options. And thinking back, I've talked to a few users of cargo-fuzz who don't want to turn off ASan, even if the only unsafe code in their build is in the standard library. Whether or not this decision is well-informed is somewhat besides the point; I think some people would be bothered to learn it had been toggled off for them.

In terms of rapid iteration speed, -Zbuild-std and debug_assertions are hardly impactful. From the extra assertions that were just added, I measured a 0.5% regression on the runtime of ./x.py test. The standard library is just another dependency, it basically doesn't alter your iteration speed, only your cold build time.

saethlin avatar Apr 03 '22 18:04 saethlin

I sort of agree with @kpreid's suggestion, in theory. I think that usage of cargo-fuzz is easily categorized but when it comes down to picking names for the profiles and deciding exactly what will be in them I'm at a loss. ASan, TSan, and MSan all conflict with each other, so we need either at least 4 profiles, or some mixture of profile flags and sanitizer flags, at which point we don't really have fewer flags than stipulating all the individual options.

Ah, when I wrote that suggestion I wasn't aware there were conflicts. The first idea I think of to address that is: for an "everything" profile, why not compile the fuzz target with the incompatible sanitizers separately, and run all of them?

Support for running more than one fuzz target binary (whether in parallel or interleaved) would also benefit users with several fuzz targets who want to throw some CPU time at checking everything (after making an infrastructure change that potential affects multiple entry-points, or in CI). This is a feature I'd like to use, even independent of any question of flag-combinations.

kpreid avatar Apr 03 '22 19:04 kpreid

why not compile the fuzz target with the incompatible sanitizers separately, and run all of them?

That sounds awesome and out of scope for this PR :sweat_smile: Turning on various sanitizers requires requires different flags to cargo, so now we're talking about invoking cargo in parallel with different target directories, and also the fuzzing targets want to write to stdout/stderr so where are we going with that... yeah.

saethlin avatar Apr 03 '22 20:04 saethlin

I changed this PR back to default to turning on -Zbuild-std. I think this is the best default because it detects more bugs, which is the whole point of fuzzing. Case in point, I had to fix one of the tests because it was UB on two accounts, and defaulting to -Zbuild-std detected one kind of UB before the UB that was supposed to be detected. I expect that defaulting to -Zbuild-std will primarily detect such out-of-bounds get_unchecked usage. Though I am also aware of at least one crate that calls copy_nonoverlapping with overlapping ranges, which -Zbuild-std will also detect.

saethlin avatar Jun 09 '22 04:06 saethlin

@saethlin Address Sanitizer already detects both out-of-bounds indexing and improper use of copy_nonoverlapping, and it is already enabled by default, so I don't think this provides much benefit as a default.

Shnatsel avatar Feb 17 '23 14:02 Shnatsel

Address Sanitizer detects reads and writes which are out-of-bounds of the allocation, which is a significantly more permissive check than what the standard library requires. For example, this program executes UB which is caught by the standard library debug assertions but not caught by Address Sanitizer:

fn main() {
    let v = vec![1, 2, 3];
    let sli = &v[..1];
    let _ouch: &i32 = unsafe { sli.get_unchecked(1) };
}

And it is still not caught if we access through the reference.

Address Sanitizer also does not catch misuse of copy_nonoverlapping. It probably has support for memcpy, but copy_nonoverlapping lowers to whatever suits. For example, this program executes UB which is caught by the standard library debug assertions but not caught by Address Sanitizer:

fn main() {
    let mut v = vec![1, 2, 3];
    let ptr = v.as_mut_ptr();
    unsafe { core::ptr::copy_nonoverlapping(ptr, ptr, 1) };
}

saethlin avatar Feb 17 '23 16:02 saethlin

Superseded by #339

Shnatsel avatar May 21 '23 18:05 Shnatsel