go-fuzz icon indicating copy to clipboard operation
go-fuzz copied to clipboard

load balance between multiple fuzz functions

Open josharian opened this issue 6 years ago • 15 comments

Juggling multiple fuzz functions is an annoyance.

Currently you have to compile O(n) times, track O(n) zip files, and run O(n) go-fuzz instances using O(n) distinct workdirs.

I'd rather compile once, identifying a set of fuzz functions, and then have go-fuzz do some balancing between different fuzz functions (equal execution time? equal number of iterations?), using subdirs of a single parent workdir.

I'm game to try implementing this, but I wanted to solicit feedback first.

Update: support for multiple fuzz functions in a single binary is implemented and it can be selected at start. However, we still fuzz only 1 function at a time.

josharian avatar Mar 29 '19 00:03 josharian

(I assume all fuzz functions must be within a single package for this.)

josharian avatar Mar 29 '19 00:03 josharian

I have been thinking a bit about this recently in the context of golang/go#19109.

Especially as fuzz functions hopefully become more common, it would be quite nice to be able to do something like go test -fuzz=. ./... and have that hit N fuzz functions across M packages.

golang/go#19109 lives in a hopefully slightly happier future where instrumentation and compilation times are much better (perhaps through some combination of better compiler integration, better caching, and/or other magic), and at least in that future world, it would be nice if you could hit ctrl-C say 5 minutes after go test -fuzz=. ./... or after letting it run overnight, and in both cases have some reasonable expectation of fuzzing time having been split at least semi-equitably across your N fuzz functions. Perhaps the initial round could have a fairly small time quantum per fuzz function, and then perhaps increase the time quantum in subsequent rounds, or something like that?

Returning more to the present, if you were to pursue something like you suggest for go-fuzz, what would you think about having -workdir default to pkgpath/testdata/fuzz/<fuzzname>, which is what golang/go#19109 suggests as the default value for a future -fuzzdir?

thepudds avatar Mar 29 '19 02:03 thepudds

We could do multi-package fuzz functions, but the implementation is a fair bit hairier. Maybe it is worth leaping there, but since I am doing this as a side project, I’d rather go incrementally. The first step is all in one package. That’s complicated enough, honestly.

The default workdir sounds fine, sure.

josharian avatar Mar 29 '19 04:03 josharian

It can make sense to start with allowing to select the fuzz function after building the package for fuzzing, namely as go-fuzz flag. Could we collect all possible fuzz functions keyed by names and store in a map, and then pass the one we want to run from go-fuzz? This would allow to build other things on top, but also useful in itself.

dvyukov avatar Mar 29 '19 06:03 dvyukov

Good idea, Dmitry. I’ll start there.

josharian avatar Mar 29 '19 15:03 josharian

Hi @josharian, sorry, my comment was not very clear. I did not mean to suggest that I think you should do additional work now beyond what you were contemplating. I was more trying to comment on how it could work if golang/go#19109 happens and if that proposal were to change to support multiple fuzz targets. And I agree moving incrementally here in this issue is a very good path forward.

As I mentioned, I had been looking at this multi-target question in the context of golang/go#19109, and it was a small change to add support for multiple fuzz targets to the simple fzgo prototype.

Of the issues you mentioned in your first comment here:

Currently you have to compile O(n) times, track O(n) zip files, and run O(n) go-fuzz instances using O(n) distinct workdirs.

It takes a simple approach in fzgo and it still compiles O(n) times, but it automatically manages the three other pieces you mentioned there when it matches multiple fuzz targets.

Here is a sample run. It starts with a small-ish time quantum of 5 sec for each fuzz target, and then doubles that every round until a max of 5 min, and runs forever (unless you specify the optional -fuzztime argument). If go-fuzz ends up implementing what you suggested, it could perhaps start with a smaller time quantum (e.g., in case there are many fuzz targets), but in this simple approach fzgo is trying to allow the first status update from go-fuzz, which is usually after about 3 seconds.

$ fzgo test -fuzz=. ./examples/...

fzgo: using cached instrumented binary for sample.FuzzModfile
fzgo: using cached instrumented binary for sample.FuzzSemver
fzgo: using cached instrumented binary for fuzztime.FuzzTime

fzgo: starting fuzzing sample.FuzzModfile
2019/03/29 15:24:15 workers: 4, corpus: 452 (0s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 0, uptime: 3s
fzgo: starting fuzzing sample.FuzzSemver
2019/03/29 15:24:21 workers: 4, corpus: 140 (3s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 0, uptime: 3s
fzgo: starting fuzzing fuzztime.FuzzTime
2019/03/29 15:24:26 workers: 4, corpus: 3 (3s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 0, uptime: 3s
fzgo: starting fuzzing sample.FuzzModfile
2019/03/29 15:25:01 workers: 4, corpus: 457 (0s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 0, uptime: 3s
2019/03/29 15:25:04 workers: 4, corpus: 460 (2s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 882, uptime: 6s
2019/03/29 15:25:07 workers: 4, corpus: 465 (0s ago), crashers: 0, restarts: 1/6751, execs: 128285 (14086/sec), cover: 882, uptime: 9s
fzgo: starting fuzzing sample.FuzzSemver
^C
2019/03/29 15:25:46 shutting down...

In any event, that is much more focused on golang/go#19109, and less on the issue at hand, so I'll stop there.

Anything you do here in this issue for go-fuzz I am sure would be much appreciated by current go-fuzz users, just like there is already much appreciation for the performance improvements, CLI simplification, etc., that you have already been doing recently for go-fuzz.

thepudds avatar Mar 29 '19 19:03 thepudds

I added the ability to compile multiple fuzz functions at once and then select between them at go-fuzz time. What isn’t implemented is the ability to have go-fuzz run all fuzz functions at once, sharing time between them.

josharian avatar Feb 12 '20 14:02 josharian

Kind of wishing I didn't delete my last comment... It was in the middle of the night, no sleep. :)

It took some digging; but, I noticed in the commit references above that 7f2a178 added the ability to inspect all funcs for ones that match the signature go-fuzz expects:

func X(data []byte) int

Where X can be of any name. IMO, it should require a prefix of of Fuzz, like Test functions do. Anyhoot...

I was so excited that I tried it out and it worked as is!

// +build gofuzz

package podcast_fuzz

import (
	"encoding/binary"

	"github.com/eduncan911/podcast"
)

func FuzzAddEnclosure(data []byte) int {
	// parse url to string
	url := string(data)
	// parse length to int64
	length, read := binary.Varint(data)
	if length <= 0 && read == 0 {
		// error converting []byte into int64
		return 0
	}

	i := podcast.Item{}
	i.AddEnclosure(url, podcast.MP3, length)
	return 1
}
$ cd fuzz
$ go-fuzz-build
$ go-fuzz -bin=podcast_fuzz-fuzz.zip
2020/02/12 12:10:18 workers: 8, corpus: 18 (2s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 0, uptime: 3s
2020/02/12 12:10:21 workers: 8, corpus: 18 (5s ago), crashers: 0, restarts: 1/0, execs: 0 (0/sec), cover: 156, uptime: 6s
2020/02/12 12:10:24 workers: 8, corpus: 18 (8s ago), crashers: 0, restarts: 1/5105, execs: 122536 (13614/sec), cover: 156, uptime: 9s

However, that was short lived as when I added a second fuzz function, I immediately got the error (and drain of excitement):

$ go-fuzz -bin=podcast_fuzz-fuzz.zip
2020/02/12 12:10:08 -func flag not provided, but multiple fuzz functions available: FuzzAddDuration, FuzzAddEnclosure

Doh...

@josharian Question, would it be acceptable to call all fuzz functions from a single parent? (and feel free to criticize if I should even be fuzzing Int64 functions or not). Something like:

func Fuzz(data []byte) int {
	if v := FuzzAddDuration(data); v != 1 {
		return v
	}
	if v := FuzzAddEnclosure(data); v != 1 {
		return v
	}
	return 1
}

Or maybe to force the run through all funcs...

func Fuzz(data []byte) int {
	val := 1
	if v := FuzzAddDuration(data); v != 1 {
		val = v
	}
	if v := FuzzAddEnclosure(data); v != 1 {
		val = v
	}
	return val
}

I would have a concern that not all use cases would make it down to the lower funcs being called in the list, like during crashes even if the crash was properly caught by the funcs up higher in the method.

I also suspect the latter could override a -1 value with a later 0 return val.

eduncan911 avatar Feb 12 '20 17:02 eduncan911

Question, would it be acceptable to call all fuzz functions from a single parent?

You can, but there are downsides: Your corpus and your coverage information gets mixed together. Sometimes that is useful, and sometimes it isn't. It's not something that go-fuzz can do automatically.

josharian avatar Feb 12 '20 19:02 josharian

IMO, it should require a prefix of of Fuzz, like Test functions do.

It does. See func isFuzzFuncName in the commit you linked to.

josharian avatar Feb 12 '20 19:02 josharian

@josharian

I added the ability to compile multiple fuzz functions at once and then select between them at go-fuzz time.

Selecting (go-fuzz -func) works very well, thanks!

Would it make sense to close this ticket as done and open another one specifically for what is left:

What isn’t implemented is the ability to have go-fuzz run all fuzz functions at once, sharing time between them.

I believe it would better represent reality for somebody looking at the backlog.

marco-m avatar Mar 22 '20 16:03 marco-m

Hi @marco-m I update the title and added an update to the description. Does it look better?

Thinking of this, any serious project will most likely have >1 package as well. This means one still needs to start multiple instances, distribute resources, etc. This should be solved by continuous fuzzing services like OSS-Fuzz, fuzzit.dev, fuzzbuzz.io. If we are talking about a poor-man's local replacement, perhaps this thing should be built on top of go-fuzz (separate binary) and also handle multiple packages as well. E.g. you give it a set a binaries, or better set of packages and it does the rest.

dvyukov avatar Mar 24 '20 08:03 dvyukov

hello @dvyukov

I update the title and added an update to the description. Does it look better?

Yes, thanks.

Thinking of this, any serious project will most likely have >1 package as well. This means one still needs to start multiple instances, distribute resources, etc.

This should be solved by continuous fuzzing services like OSS-Fuzz, fuzzit.dev, fuzzbuzz.io.

I agree.

If we are talking about a poor-man's local replacement, perhaps this thing should be built on top of go-fuzz (separate binary) and also handle multiple packages as well. E.g. you give it a set a binaries, or better set of packages and it does the rest.

That would make sense.

My 2 cents :-)

marco-m avatar Mar 24 '20 10:03 marco-m

Question, would it be acceptable to call all fuzz functions from a single parent?

You can, but there are downsides: Your corpus and your coverage information gets mixed together. Sometimes that is useful, and sometimes it isn't. It's not something that go-fuzz can do automatically.

I'm curious about a fix for "All" funcs in a package being run, and wondered if having multiple dirs (one per func, each housing a corpus, crashers, and suppressions dir) is a solution?

Also, I was thinking of letting the Go scheduler handle the load-balancing of running each function. That is, launch each of the fuzz tests with a go routine, and then leav the scheduler to decide who gets some time.

Are these reasonable ideas?

shaneHowearth avatar Nov 26 '20 00:11 shaneHowearth

go-fuzz is being superseded by the standard toolchain: https://go.googlesource.com/proposal/+/master/design/draft-fuzzing.md https://github.com/golang/go/tree/dev.fuzz Any new development in go-fuzz beyond bug fixing does not make sense now.

dvyukov avatar Nov 26 '20 06:11 dvyukov