Question: Integrating 3rd party fuzzers
Fuzzers that use libFuzzer and AFL run as a single process on the host. The bot manages syncing their corpus and all other aspects of their run.
What are the requirements/API/best way to integrate other 3rd party fuzzers?
Two important cases that come to mind are:
- Fuzzers that run in containers and include some kind of API endpoint (e.g. REST) plus a fuzzer process that fuzzes it
- Fuzzers that are a single process but hit remote API endpoints, so the crash happens on another host
Right now, those are only possible in the blackbox fuzzing style - https://google.github.io/clusterfuzz/setting-up-fuzzing/blackbox-fuzzing/
Can you give a more detailed, concrete example of this? We are planning to simplify this blackbox fuzzer pipeline for use cases like yours, so this is great timing with respect to gathering requirements.
Let's assume that there's an API https://myapi.foboar.com?q=<query>. A simple fuzzer could be just a few lines in Python:
```python
import requests
import string
import random

import some_telemetry


def mutate_case(case):
    DOMAIN = string.ascii_letters + string.punctuation
    MUTATION_OPS = 5
    q = bytearray(case.encode())
    for i in range(MUTATION_OPS):
        offset = random.randint(0, len(q) - 1)
        q[offset] = random.choice(DOMAIN).encode()[0]
    return q.decode()


def ray_id():
    RAY_LEN = 20
    return ''.join(random.choice(string.ascii_lowercase)
                   for i in range(RAY_LEN))


if __name__ == "__main__":
    with open('corpus.txt', 'rt') as f:
        corpus = [i.strip() for i in f.readlines()]

    while True:
        try:
            q = mutate_case(random.choice(corpus))
            ray = ray_id()
            r = requests.get(f'https://myapi.foboar.com?q={q}',
                             headers={'some-telemetry-ray': ray})
            if r.status_code != 200:
                pass  # Legit error code, 'q' may be interesting.
        except requests.exceptions.Timeout:
            crash = some_telemetry.find_crash(ray=ray)
            if crash:
                crash.print_stack()  # There was a crash. 'q' is interesting.
            else:
                pass  # Might be a crash or not. 'q' may be interesting.
```
It could hit a myapi.foboar.com endpoint that lives in an isolated fuzzing cluster or in a container. It's also possible for the above fuzzer and the endpoint to live in the same container or Pod.
The goal for the design of this new interface is that the fuzzer will manage most aspects of fuzzing on its own. It will have a way to provide interesting test cases (e.g. crashing inputs) back to ClusterFuzz, which will then create reports for them. ClusterFuzz will still handle downloading builds for the test binary, and will run the fuzzer script by passing several arguments to it (for things like the path to the test binary, an optional corpus directory, a directory to write interesting outputs to, and command line arguments for the job), but everything else can be managed by the script.
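To make that concrete, here is a rough sketch of what such a fuzzer script's entry point might look like. The flag names (`--target_binary`, `--corpus_dir`, `--output_dir`) and the stub fuzzing loop are purely illustrative assumptions; the actual argument interface is still to be defined.

```python
# Hypothetical skeleton of a fuzzer script invoked by ClusterFuzz.
# All flag names below are assumptions, not a finalized interface.
import argparse


def run_fuzzing_loop(args):
    # Placeholder: this is where a fuzzer like the requests-based loop above
    # would run, writing any interesting/crashing inputs into args.output_dir
    # so ClusterFuzz can pick them up and file reports.
    pass


def main():
    parser = argparse.ArgumentParser(description='3rd party fuzzer wrapper')
    parser.add_argument('--target_binary',
                        help='Path to the test binary downloaded by ClusterFuzz.')
    parser.add_argument('--corpus_dir',
                        help='Optional directory containing the synced corpus.')
    parser.add_argument('--output_dir',
                        help='Directory to write interesting test cases to.')
    parser.add_argument('job_args', nargs=argparse.REMAINDER,
                        help='Job-specific command line arguments.')
    run_fuzzing_loop(parser.parse_args())


if __name__ == '__main__':
    main()
```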
From a quick skim over your example I think this could certainly work for your use case. If you have any concerns let me know and I'll try to make sure we support what you're trying to do. We intend for this to be a catch-all for fuzzers that don't work well with either our existing fuzzing engine support (for things like AFL, libFuzzer and honggfuzz), or our blackbox fuzzing support (which only handles simple cases where all inputs can be generated ahead of time and passed to a test binary after the fact).
> ClusterFuzz will still handle downloading builds for the test binary, and will run the fuzzer script by passing several arguments to it (for things like the path to the test binary, an optional corpus directory, a directory to write interesting outputs to, and command line arguments for the job), but everything else can be managed by the script.
Sounds good. It's a level of abstraction above what is already there. Maybe an "Open-Fuzzer Interface" :)
The most challenging part, I guess, is the formalization of "interesting outputs", since they will have to support aggregation (i.e. bug deduplication), possible minimization, and flagging whether a bug is a security issue or not.
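Purely as a strawman for that formalization, each interesting test case could be accompanied by a small metadata record carrying what ClusterFuzz would need for deduplication, minimization, and security triage. The field names and function below are assumptions for illustration, not an agreed format.

```python
# Hypothetical per-test-case metadata; field names are illustrative only.
import json
import os


def write_interesting_case(output_dir, case_id, test_input, crash_state,
                           stack_trace, is_security_issue):
    # Raw input that triggered the behaviour, kept for reproduction and
    # possible minimization.
    with open(os.path.join(output_dir, f'{case_id}.input'), 'wb') as f:
        f.write(test_input)

    metadata = {
        'crash_state': crash_state,     # e.g. top frames, used for deduplication
        'stack_trace': stack_trace,     # full trace for the report
        'security': is_security_issue,  # whether this looks security-relevant
    }
    with open(os.path.join(output_dir, f'{case_id}.json'), 'wt') as f:
        json.dump(metadata, f)
```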
Excuse me, I want to know: how can I run honggfuzz locally?