Add length range to FuzzInputOptions
Closes #20816
Example test case, fuzzing f64 printing:

const std = @import("std");
const testing = std.testing;

test "fuzz f64 printing" {
    const input_bytes = testing.fuzzInput(.{
        .corpus = &.{std.mem.asBytes(&@as(f64, 32.0))},
        .len_range = .{ .min = 8, .max = 8 },
    });
    var buf: [1024]u8 = undefined;
    _ = try std.fmt.bufPrint(&buf, "{}", .{@as(f64, @bitCast(input_bytes[0..8].*))});
}
If an input corpus is not provided, the first run of the fuzzer is seeded with random bytes. The number of bytes is also random, but always falls within the min and max specified by the len_range option.
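The seeding behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `LenRange` and `randomSeedInput` are hypothetical names, the caller supplies the buffer to sidestep allocation, and `std.Random` is assumed to be available under that name (older Zig releases call it `std.rand`).

```zig
const std = @import("std");

// Hypothetical stand-in for the `len_range` option; the real layout may differ.
const LenRange = struct {
    min: usize,
    max: usize,
};

/// Fills a prefix of `buf` with random bytes and returns that prefix.
/// The prefix length is drawn from the inclusive range [range.min, range.max].
fn randomSeedInput(random: std.Random, range: LenRange, buf: []u8) []u8 {
    std.debug.assert(range.min <= range.max);
    std.debug.assert(range.max <= buf.len);
    const len = random.intRangeAtMost(usize, range.min, range.max);
    random.bytes(buf[0..len]);
    return buf[0..len];
}

test "random seed input respects the length range" {
    var prng = std.Random.DefaultPrng.init(0x5eed);
    var buf: [64]u8 = undefined;
    const range: LenRange = .{ .min = 8, .max = 16 };
    for (0..100) |_| {
        const input = randomSeedInput(prng.random(), range, &buf);
        try std.testing.expect(input.len >= range.min and input.len <= range.max);
    }
}
```

With `.min = 8, .max = 8`, as in the f64 example, every seed is exactly 8 bytes long.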
Several invariants are enforced with assertions:
- len_range.min <= len_range.max
- The output of Fuzzer.next() is always between the min and max length (inclusive).
- The length of corpus entries is between the min and max range.
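For illustration, the invariants listed above could be checked along these lines. `LenRange`, `contains`, and `checkOptions` are hypothetical names made up for this sketch; they are not the identifiers used in the PR, and the real assertions live inside the fuzzer.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical mirror of the `len_range` option.
const LenRange = struct {
    min: usize,
    max: usize,

    /// True when `len` lies in the inclusive range [min, max].
    fn contains(self: LenRange, len: usize) bool {
        return self.min <= len and len <= self.max;
    }
};

/// Asserts that the range is well ordered and that every corpus entry fits in it.
fn checkOptions(range: LenRange, corpus: []const []const u8) void {
    assert(range.min <= range.max);
    for (corpus) |entry| assert(range.contains(entry.len));
}

test "length-range invariants" {
    const range: LenRange = .{ .min = 8, .max = 8 };
    const seed: f64 = 32.0;
    const corpus = [_][]const u8{std.mem.asBytes(&seed)};
    checkOptions(range, &corpus);
    try std.testing.expect(range.contains(8));
    try std.testing.expect(!range.contains(7));
    try std.testing.expect(!range.contains(9));
}
```

The same `contains` check would apply to each slice produced by Fuzzer.next() before it is handed to the test body.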
I'll look at this in a couple days - I have a rather large body of work sitting in the fuzz branch right now.
No problem, I figured. As of yesterday looks like the conflicts would be pretty minimal, so hopefully it'll stay easy.
Looks like there were indeed a couple of conflicts.
Conflicts resolved 👍
If the minimum length is greater than 0, the corpus must contain at least 1 entry, so that we can use it for the initial run/smoke test of the fuzzer.
No need, it can generate random bytes in this case.
I considered this. The concern I had was with ownership of the memory for holding the input. Since the length range is runtime known we need to allocate it dynamically. Under normal circumstances the fuzzer owns all of the returned memory, but on the initial run it's not clear how to de-allocate the slice returned from fuzzInput. A few options I considered:
- Use a comptime-known length range: This would force the FuzzInputOptions parameter to be comptime known. Being able to load corpus entries from a file (without '@embedFile') would be nice, which this option would preclude. The length range could also be moved to a second parameter.
- Reorganize the fuzzer code so it is still able to manage the returned input even when not linked to libfuzzer: This might be the best option, but I haven't explored how difficult it would be. My objective was to minimize changes to the current fuzzing architecture.
- Add a 'deinit' function for the fuzzer input: Totally possible, just annoying because it would do nothing while actually fuzzing; its only responsibility would be cleaning up this one allocation for the smoke test.
Does one of these (or something else I haven't considered) seem like a better approach?
OK, there turned out to be a pretty easy solution: add another global variable to track the bytes allocated for the smoke test. I think that should work.
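A minimal sketch of the global-variable idea, under stated assumptions: `smoke_test_input`, `smokeTestInput`, and `freeSmokeTestInput` are hypothetical names, the allocator is passed in explicitly, and the buffer is zero-filled here as a placeholder where the real fuzzer would write random bytes. The actual change in this PR may be organized differently.

```zig
const std = @import("std");

// Hypothetical global: remembers the buffer handed out for the smoke test so
// that the fuzzer side, rather than the test author, can release it later.
var smoke_test_input: ?[]u8 = null;

/// Allocates a smoke-test input of `len` bytes and records it in the global,
/// freeing any buffer left over from an earlier call.
fn smokeTestInput(gpa: std.mem.Allocator, len: usize) ![]const u8 {
    freeSmokeTestInput(gpa);
    const bytes = try gpa.alloc(u8, len);
    @memset(bytes, 0); // placeholder contents; random bytes in the real case
    smoke_test_input = bytes;
    return bytes;
}

/// Releases the recorded buffer, if there is one.
fn freeSmokeTestInput(gpa: std.mem.Allocator) void {
    if (smoke_test_input) |bytes| {
        gpa.free(bytes);
        smoke_test_input = null;
    }
}

test "smoke-test input is owned by the global" {
    const gpa = std.testing.allocator;
    const input = try smokeTestInput(gpa, 8);
    try std.testing.expectEqual(@as(usize, 8), input.len);
    freeSmokeTestInput(gpa);
    try std.testing.expect(smoke_test_input == null);
}
```

The point of the design is that the caller of fuzzInput never sees an allocator or a deinit call; the one extra allocation is tracked and released entirely on the fuzzer side.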
Sounds good! Once that lands I'll take another look 👍
Looks like the updated version is now green 👍🏻
Going to close this given that #23416 is merged. I think things have changed enough that it will need to be mostly re-implemented.