cargo-fuzz

Document where to save corpus

Open fitzgen opened this issue 6 years ago • 2 comments

This hasn't been super clear to me, and I haven't really seen it discussed anywhere.

The corpus

  • can end up getting sizable (see also #163)
  • often isn't human readable

Committing it to the repo of the project being fuzzed seems like it could add a bunch of git overhead and even make merges difficult.

But it is needed to "pick up where you left off" when doing time-budgeted fuzzing. Anyone fuzzing the project is going to want that corpus.

I guess it could be in a git submodule? That has its own overhead, but seems like maybe a good fit for when only some folks (or just CI or something) are fuzzing, and not every local developer.

Although, maybe I'm more concerned about this than I should be?

Do folks have thoughts on this?

fitzgen, Nov 22 '19 18:11

I don't have a strong opinion, but I do think the corpus is not actually a version controlled object and should not be in your main repo (or even a submodule). Using a newer corpus on an older commit should be fine, and vice versa. The problem with merge conflicts is a manifestation of this: it's not a thing you actually want to merge carefully.
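To illustrate why careful merging isn't needed: a corpus is just a set of independent input files, so combining two copies amounts to a set union. Here is a minimal sketch of that idea. The function name `merge_corpus` is made up for illustration, and naming files by the SHA-1 of their contents only mirrors what libFuzzer does for the inputs it writes. This is a naive union, not a replacement for libFuzzer's own coverage-aware `-merge=1` mode.

```python
import hashlib
import shutil
from pathlib import Path


def merge_corpus(src: Path, dst: Path) -> int:
    """Copy every input in `src` into `dst`, skipping duplicates by content.

    Hypothetical helper: returns the number of new files added to `dst`.
    """
    dst.mkdir(parents=True, exist_ok=True)

    # Index what the destination already holds, keyed by content hash.
    seen = {
        hashlib.sha1(p.read_bytes()).hexdigest()
        for p in dst.iterdir()
        if p.is_file()
    }

    added = 0
    for p in sorted(src.iterdir()):
        if not p.is_file():
            continue  # skip subdirectories such as .git
        digest = hashlib.sha1(p.read_bytes()).hexdigest()
        if digest in seen:
            continue  # same bytes already present, whatever the file name
        # Name the copy after its content hash so names never collide.
        shutil.copyfile(p, dst / digest)
        seen.add(digest)
        added += 1
    return added
```

Since the result doesn't depend on the order in which copies are combined, there is nothing for git to resolve line by line.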

I'd recommend you store it on a server as a file, or perhaps use a separate repo that your CI's deploy step keeps up to date (the way folks do with GitHub Pages docs). Your CI can download this repo to a folder and use it.
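A rough sketch of what that CI step could look like, assuming the corpus lives in its own git repo with one directory per fuzz target. The repo URL, the `corpus-repo` checkout directory, the commit message, and both function names are placeholders. The `cargo fuzz run <target> <corpus dir> -- -max_total_time=N` form passes the time budget through to libFuzzer.

```python
import subprocess
from pathlib import Path


def corpus_sync_commands(corpus_repo_url, target, budget_secs,
                         checkout=Path("corpus-repo")):
    """Build the commands for one time-budgeted run (hypothetical layout)."""
    repo = checkout.resolve()
    # Absolute path, so it works regardless of the fuzzer's working directory.
    corpus_dir = repo / target
    return {
        "fetch": [
            ["git", "clone", "--depth", "1", corpus_repo_url, str(repo)],
        ],
        "fuzz": [
            # Everything after `--` goes to libFuzzer itself.
            ["cargo", "fuzz", "run", target, str(corpus_dir),
             "--", f"-max_total_time={budget_secs}"],
        ],
        "publish": [
            ["git", "-C", str(repo), "add", "--all"],
            ["git", "-C", str(repo), "commit", "-m",
             f"Update corpus for {target}"],
            ["git", "-C", str(repo), "push"],
        ],
    }


def run_sync(corpus_repo_url, target, budget_secs,
             checkout=Path("corpus-repo")):
    """Run the steps in order, publishing only if new inputs were found."""
    cmds = corpus_sync_commands(corpus_repo_url, target, budget_secs, checkout)
    repo = checkout.resolve()

    for cmd in cmds["fetch"]:
        subprocess.run(cmd, check=True)
    # A brand-new target has no corpus directory in the repo yet.
    (repo / target).mkdir(parents=True, exist_ok=True)
    for cmd in cmds["fuzz"]:
        # Raises if the fuzzer exits non-zero (e.g. it found a crash),
        # in which case nothing is published by this sketch.
        subprocess.run(cmd, check=True)

    status = subprocess.run(
        ["git", "-C", str(repo), "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    if status.stdout.strip():
        for cmd in cmds["publish"]:
            subprocess.run(cmd, check=True)
```

Run from the root of the project being fuzzed, something like `run_sync("<corpus repo URL>", "<fuzz target>", 300)` would give each CI job a five-minute budget that starts from whatever the previous job pushed.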

Manishearth, Nov 25 '19 23:11

Yeah, I think keeping it in a new repo is the way to go. I'll leave this issue open for tracking documentation of that.

fitzgen, Dec 03 '19 19:12