Question: Integrating 3rd party fuzzers

Open lookfwd opened this issue 5 years ago • 5 comments

Fuzzers that use libFuzzer and AFL run as a single process on the host. The bot manages syncing their corpus and all other aspects of their run.

What are the requirements/API/best way to integrate other 3rd party fuzzers?

Two important cases that come to mind are:

  • Fuzzers that run in containers and include some kind of API endpoint (e.g. REST) and a fuzzer process that fuzzes it
  • Fuzzers that are single-process but hit remote API endpoints. As such, the crash will happen on another host.

lookfwd avatar Jul 28 '20 14:07 lookfwd

Right now, those are only possible with blackbox-style fuzzing: https://google.github.io/clusterfuzz/setting-up-fuzzing/blackbox-fuzzing/
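For context, a blackbox fuzzer in this style is essentially a test case generator that the bot invokes. The sketch below assumes the contract described on the linked page as I recall it: the script receives `--input_dir`, `--output_dir` and `--no_of_files`, and writes files whose names start with `fuzz-`. Treat those flag names and the prefix as assumptions to verify against the documentation. The mutation strategy is purely illustrative.

```python
import argparse
import os
import random
import string


def mutate(data: bytes, ops: int = 5) -> bytes:
    """Overwrite up to `ops` random bytes with printable ASCII characters."""
    if not data:
        return data
    buf = bytearray(data)
    alphabet = (string.ascii_letters + string.punctuation).encode()
    for _ in range(ops):
        buf[random.randrange(len(buf))] = random.choice(alphabet)
    return bytes(buf)


def generate(input_dir: str, output_dir: str, count: int) -> list:
    """Write `count` mutated test cases into output_dir and return their paths."""
    seeds = []
    for name in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, name)
        if os.path.isfile(path):
            with open(path, 'rb') as f:
                seeds.append(f.read())
    if not seeds:
        seeds = [b'seed']  # Fall back to a trivial seed if the corpus is empty.
    os.makedirs(output_dir, exist_ok=True)
    written = []
    for i in range(count):
        # Assumed naming convention: generated test cases start with 'fuzz-'.
        out = os.path.join(output_dir, f'fuzz-{i}')
        with open(out, 'wb') as f:
            f.write(mutate(random.choice(seeds)))
        written.append(out)
    return written


def main(argv=None):
    # Assumed flag names; check them against the blackbox fuzzing docs.
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_dir', required=True)
    parser.add_argument('--output_dir', required=True)
    parser.add_argument('--no_of_files', type=int, required=True)
    args = parser.parse_args(argv)
    generate(args.input_dir, args.output_dir, args.no_of_files)
```

To use this as a script, call `main()` under an `if __name__ == '__main__':` guard. Note that a generator like this only produces inputs ahead of time; it never observes the target, which is the limitation the rest of the thread discusses.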

Can you give a more detailed, concrete example of this? We are planning to simplify this blackbox fuzzer pipeline for use cases like yours, so this is great timing for gathering requirements.

inferno-chromium avatar Jul 28 '20 18:07 inferno-chromium

Let's assume that there's an API https://myapi.foboar.com?q=<query>. A simple fuzzer could be just a few lines in Python:

import requests
import string
import random
import some_telemetry


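def mutate_case(case):
    """Overwrite a few random bytes of `case` with ASCII letters/punctuation."""
    DOMAIN = string.ascii_letters + string.punctuation
    MUTATION_OPS = 5
    q = bytearray(case.encode())
    if not q:
        return case  # Nothing to mutate in an empty test case.
    for _ in range(MUTATION_OPS):
        offset = random.randrange(len(q))
        q[offset] = ord(random.choice(DOMAIN))
    # A mutation may split a multi-byte UTF-8 sequence, so decode leniently.
    return q.decode(errors='replace')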


def ray_id():
    RAY_LEN = 20
    return ''.join(random.choice(string.ascii_lowercase)
                   for i in range(RAY_LEN))


if __name__ == "__main__":
    with open('corpus.txt', 'rt') as f:
        # Skip blank lines so every corpus entry has something to mutate.
        corpus = [line.strip() for line in f if line.strip()]

    while True:
        try:
            q = mutate_case(random.choice(corpus))
            ray = ray_id()
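            # requests has no default timeout; without one the Timeout
            # handler below never runs. Passing 'q' via params lets
            # requests URL-encode characters such as '&' and '#'.
            r = requests.get('https://myapi.foboar.com', params={'q': q},
                             headers={'some-telemetry-ray': ray},
                             timeout=10)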
            if r.status_code != 200:
                pass  # Legit Error Code, 'q' may be interesting
        except requests.exceptions.Timeout:
            crash = some_telemetry.find_crash(ray=ray)
            if crash:
                crash.print_stack()  # There was a crash. 'q' is interesting.
            else:
                pass  # Might be a crash or not. 'q' may be interesting

It could hit a myapi.foboar.com endpoint that lives in an isolated fuzzing cluster or in a container. It's also possible for the above fuzzer and the endpoint to live in the same container or Pod.

lookfwd avatar Jul 28 '20 19:07 lookfwd

The goal for the design of this new interface is that the fuzzer will manage most aspects of fuzzing on its own. It will have a way to provide interesting test cases (e.g. crashing inputs) back to ClusterFuzz which we'll go ahead and create reports for. ClusterFuzz will still handle downloading builds for the test binary, and will run the fuzzer script by passing several arguments to it (for things like the path to the test binary, optional corpus directory, directory to write interesting outputs to, and command line arguments for the job), but everything else could be managed by the script.
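As a rough illustration of what such a fuzzer script might look like on the receiving end, here is a minimal sketch. The interface is not defined in this thread, so every flag name (`--target`, `--corpus-dir`, `--output-dir`) and the `interesting-` file prefix are hypothetical; only the kinds of arguments come from the description above.

```python
import argparse
import hashlib
import os


def parse_args(argv=None):
    # All flag names here are hypothetical; the thread does not define them.
    parser = argparse.ArgumentParser()
    parser.add_argument('--target', required=True,
                        help='path to the test binary')
    parser.add_argument('--corpus-dir', default=None,
                        help='optional corpus directory')
    parser.add_argument('--output-dir', required=True,
                        help='directory to write interesting outputs to')
    parser.add_argument('target_args', nargs='*',
                        help='command line arguments for the job')
    return parser.parse_args(argv)


def save_interesting(output_dir: str, data: bytes) -> str:
    """Store an input under a content-derived name so repeats overwrite themselves."""
    os.makedirs(output_dir, exist_ok=True)
    name = 'interesting-' + hashlib.sha256(data).hexdigest()[:16]
    path = os.path.join(output_dir, name)
    with open(path, 'wb') as f:
        f.write(data)
    return path
```

The script's own fuzzing loop would sit between these two pieces: it drives the target however it likes and calls `save_interesting` whenever an input deserves a report.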

From a quick skim over your example I think this could certainly work for your use case. If you have any concerns let me know and I'll try to make sure we support what you're trying to do. We intend for this to be a catch-all for fuzzers that don't work well with either our existing fuzzing engine support (for things like AFL, libFuzzer and honggfuzz), or our blackbox fuzzing support (which only handles simple cases where all inputs can be generated ahead of time and passed to a test binary after the fact).

mbarbella-chromium avatar Jul 28 '20 22:07 mbarbella-chromium

> ClusterFuzz will still handle downloading builds for the test binary, and will run the fuzzer script by passing several arguments to it (for things like the path to the test binary, optional corpus directory, directory to write interesting outputs to, and command line arguments for the job), but everything else could be managed by the script.

Sounds good. It's a level of abstraction above what is already there. Maybe an "Open-Fuzzer Interface" :)

The most challenging part, I guess, is formalizing "interesting outputs", since they will have to support aggregation (i.e. bug deduplication), possible minimization, and flagging whether or not something is a security bug.
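One possible shape for such an output, sketched under heavy assumptions: the `Finding` record, its field names, and the choice to deduplicate on crash type plus the top three stack frames are all hypothetical, not something ClusterFuzz or this thread specifies.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Finding:
    """Hypothetical record a fuzzer could emit for one interesting input."""
    testcase_path: str
    crash_type: str  # e.g. 'timeout' or 'heap-buffer-overflow'
    stack_frames: List[str] = field(default_factory=list)
    security_hint: bool = False  # The fuzzer's guess only, not a verdict.

    def dedup_key(self, top_n: int = 3) -> str:
        """Hash the crash type and top stack frames; equal keys group as one bug."""
        material = '\n'.join([self.crash_type] + self.stack_frames[:top_n])
        return hashlib.sha256(material.encode()).hexdigest()[:16]

    def to_json(self) -> str:
        """Serialize the record so it can be written next to the test case."""
        return json.dumps(asdict(self), sort_keys=True)
```

Two findings whose crash type and top frames match get the same key, so the consumer can aggregate them without understanding the fuzzer. Minimization would still need the consumer to be able to re-run the test case, which this record alone does not provide.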

lookfwd avatar Jul 31 '20 17:07 lookfwd

Excuse me, I'd like to know: how do I run honggfuzz locally?

gtt1995 avatar Jan 05 '21 11:01 gtt1995