bitcoin
bitcoin copied to clipboard
fuzz: Generate with random libFuzzer settings
Sometimes a libFuzzer setting like -use_value_profile=1
helps [0], sometimes it hurts [1].
[0] https://github.com/bitcoin/bitcoin/pull/20789#issuecomment-752961937 [1] https://github.com/bitcoin/bitcoin/pull/27888#issuecomment-1645976254
By picking a random value, it is ensured that at least some of the runs will have the beneficial configuration set.
Also, set -max_total_time
to prevent slow fuzz targets from getting a larger time share, or possibly peg to a single core for a long time and block the python script from exiting for a long time. This can be improved in the future. For example, the python script can exit after some time (https://github.com/bitcoin/bitcoin/pull/20752#discussion_r549248791). Alternatively, it can measure if coverage progress was made and run for less time if no progress has been made recently anyway, so that more time can be spent on targets that are new or still make progress.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage
For detailed information about the code coverage, see the test coverage report.
Reviews
See the guideline for information on the review process.
Type | Reviewers |
---|---|
ACK | dergoegge, brunoerg, murchandamus |
Stale ACK | darosior |
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Concept ACK.
Also, randomize -mutate_depth, for fun.
Could you expand?
Not sure. I am happy to drop this, but it was part of https://www.github.com/bitcoin/bitcoin/pull/20752/commits/1ff0dc525f051bbc7a93312dd622340ca8f4f52c, which is why I picked it up.
Not sure. I am happy to drop this
No, i was just curious about the rationale. I guess i've a mild preference for sticking to defaults if there isn't any, but i don't think it hurts either. ------- Original Message ------- On Friday, July 28th, 2023 at 11:07 AM, MacrabFalke @.***> wrote:
Concept ACK.
Also, randomize -mutate_depth, for fun.
Could you expand?
Not sure. I am happy to drop this, but it was part of https://www.github.com/bitcoin/bitcoin/pull/20752/commits/1ff0dc525f051bbc7a93312dd622340ca8f4f52c, which is why I picked it up.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>
Thanks, generated some data and edited the description to support the mutate_depth
choice. (Limited to [1,10] for now, but this can be changed back to [1,15], if there is at least one data point showing that values larger than 10 are useful)
I think I'd prefer continuing to use the defaults but I also don't use the runner much for generating, so no super strong opinion. I would assume that the defaults are based on data/heuristics and probably work well most of the time, so maybe make this optional?
Sometimes a libFuzzer setting like -use_value_profile=1 helps [0], sometimes it hurts [1]. By picking a random value, it is ensured that at least some of the runs will have the beneficial configuration set.
Wouldn't this only make sense if the "help" outweighs the "hurt"? For example, the metric you used in [1] is "iterations until bug found", so if randomizing the fuzzer options results in roughly equal speedup (help) and slow down (hurt) then there is no gain on average.
Except for mutate_depth
, 70% of the time the default value is used for each setting. Happy to change mutate_depth
as well, which would mean in .7**3
runs you get a "vanilla" libfuzzer.
I'd say the main goal here is to prevent a case where a bug isn't found at all (or only after a "outlier" long time) due to the default settings. I don't think we've seen such a bug in reality, so I am happy to close this pull.
Regardless, max_len
should probably be considered separate, where the goal is to reduce the storage requirements of the qa-assets fuzz inputs folder. I haven't collected any data on this either yet. Do you think it is worth it to try?
I'd say the main goal here is to prevent a case where a bug isn't found at all (or only after a "outlier" long time) due to the default settings.
This sounds reasonable but also quite rare. Could you give me an example of a theoretical target where this could happen?
Are there any other projects that do this (e.g. oss-fuzz)?
I don't think we've seen such a bug in reality, so I am happy to close this pull.
On a general note, I think it is fine to add to/improve on our fuzzing tools & techniques even if those haven't found any bugs yet but there should be some evidence that they are useful.
Regardless, max_len should probably be considered separate, where the goal is to reduce the storage requirements of the qa-assets fuzz inputs folder. I haven't collected any data on this either yet. Do you think it is worth it to try?
For max_len
we could also consider adding size guards to the targets themselves, e.g. an early if (fuzz_data_size > x) return;
which would result in fuzzers not adding (most) bigger inputs to the corpus.
Are there any other projects that do this (e.g. oss-fuzz)?
IIRC oss-fuzz did that with AFL build time settings, but removed it again because it would turn up bugs intermittently/non-deterministically. I don't think they do it for libFuzzer runtime settings, but I found this, which claims that exponential search is turned into linear search.
oss-fuzz]$ git grep -W -I use_value_profile
projects/java-example/ExampleValueProfileFuzzer.java=public class ExampleValueProfileFuzzer {
projects/java-example/ExampleValueProfileFuzzer.java- private static String base64(byte[] input) {
projects/java-example/ExampleValueProfileFuzzer.java- return Base64.getEncoder().encodeToString(input);
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java-
projects/java-example/ExampleValueProfileFuzzer.java- private static long insecureEncrypt(long input) {
projects/java-example/ExampleValueProfileFuzzer.java- long key = 0xefe4eb93215cb6b0L;
projects/java-example/ExampleValueProfileFuzzer.java- return input ^ key;
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java-
projects/java-example/ExampleValueProfileFuzzer.java- public static void fuzzerTestOneInput(FuzzedDataProvider data) {
projects/java-example/ExampleValueProfileFuzzer.java: // Without -use_value_profile=1, the fuzzer gets stuck here as there is no direct correspondence
projects/java-example/ExampleValueProfileFuzzer.java- // between the input bytes and the compared string. With value profile, the fuzzer can guess the
projects/java-example/ExampleValueProfileFuzzer.java- // expected input byte by byte, which takes linear rather than exponential time.
projects/java-example/ExampleValueProfileFuzzer.java- if (base64(data.consumeBytes(6)).equals("SmF6emVy")) {
projects/java-example/ExampleValueProfileFuzzer.java- long[] plaintextBlocks = data.consumeLongs(2);
projects/java-example/ExampleValueProfileFuzzer.java- if (plaintextBlocks.length != 2)
projects/java-example/ExampleValueProfileFuzzer.java- return;
projects/java-example/ExampleValueProfileFuzzer.java- if (insecureEncrypt(plaintextBlocks[0]) == 0x9fc48ee64d3dc090L) {
projects/java-example/ExampleValueProfileFuzzer.java: // Without --fake_pcs (enabled by default with -use_value_profile=1), the fuzzer would get
projects/java-example/ExampleValueProfileFuzzer.java- // stuck here as the value profile information for long comparisons would not be able to
projects/java-example/ExampleValueProfileFuzzer.java- // distinguish between this comparison and the one above.
projects/java-example/ExampleValueProfileFuzzer.java- if (insecureEncrypt(plaintextBlocks[1]) == 0x888a82ff483ad9c2L) {
projects/java-example/ExampleValueProfileFuzzer.java- mustNeverBeCalled();
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java-
projects/java-example/ExampleValueProfileFuzzer.java- private static void mustNeverBeCalled() {
projects/java-example/ExampleValueProfileFuzzer.java- throw new IllegalStateException("mustNeverBeCalled has been called");
projects/java-example/ExampleValueProfileFuzzer.java- }
projects/java-example/ExampleValueProfileFuzzer.java-}
there should be some evidence that they are useful.
I think I provided some evidence, but I guess you are asking about evidence that they are useful in the average case, which is a bit harder to show.
ACK fad32eb02c98d841e0cd9b585b61a698feec361a
I would use this script instead of my hacky bash script, if there were a way of running just n
random fuzz targets instead of all of them. I don’t feel like I have a good intuition on the values you’re setting, but doing something non-default in 30% of the cases look reasonable to me, based on the evidence that those settings sometimes help.
I would use this script instead of my hacky bash script, if there were a way of running just n random fuzz targets instead of all of them
like this, @murchandamus?
diff --git a/test/fuzz/test_runner.py b/test/fuzz/test_runner.py
index a60f51f40a..6510102e6c 100755
--- a/test/fuzz/test_runner.py
+++ b/test/fuzz/test_runner.py
@@ -80,6 +80,13 @@ def main():
' the given targets for a finite number of times. Outputs them to'
' the passed corpus_dir.'
)
+ parser.add_argument(
+ '--max_targets',
+ '-mt',
+ type=int,
+ default=0,
+ help='Max number of targets to run (chosen randomly). 0 means to run all of them.',
+ )
args = parser.parse_args()
args.corpus_dir = Path(args.corpus_dir)
@@ -121,6 +128,8 @@ def main():
logging.error("Target \"{}\" not found in current target list.".format(excluded_target))
continue
test_list_selection.remove(excluded_target)
+ if args.max_targets > 0:
+ test_list_selection = random.sample(test_list_selection, args.max_targets)
test_list_selection.sort()
logging.info("{} of {} detected fuzz target(s) selected: {}".format(len(test_list_selection), len(test_list_all), " ".join(test_list_selection)))
🤔 There hasn't been much activity lately and the CI seems to be failing.
If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.
Removed the max_len and mutate_depth randomization, because it seemed too controversial? Removed this from OP:
Also, randomize -max_len=
to possibly get some runs with faster iterations, or to produce smaller reduced fuzz inputs over time.
Also, randomize -mutate_depth
, as lower values seem to be beneficial as well. [2]
[2] https://github.com/bitcoin/bitcoin/pull/27888#issuecomment-1655477388
This picks up the work started in commit https://www.github.com/bitcoin/bitcoin/pull/20752/commits/1ff0dc525f051bbc7a93312dd622340ca8f4f52c
Also, added missing rss limit to avoid OOM
cc @dergoegge are you still unconvinced about this? Is there more references or data I can provide?
cc @murchandamus You may be interested in the rss_limit_mb (last commit). Not sure if you set it in your fuzz script, or if you ever ran into OOM.
cc @murchandamus You may be interested in the rss_limit_mb (last commit). Not sure if you set it in your fuzz script, or if you ever ran into OOM.
Thanks, memory has not been an issue for me.
utACK fa3a4102ef0ae06d8930d7a7b567759e2a5b5fde