syzkaller WIP: pkg/fuzzer: use layered queues

Instead of relying on a fuzzer-internal priority queue, utilize stackable layers of request-generating steps.

Move the functionality to a separate pkg/fuzzer/queue package.

The pkg/fuzzer/queue package can be reused to add extra processing layers on top of the fuzzing and to combine machine checking and fuzzing execution pipelines.

May 03 '24 15:05 a-nogikh

While working on moving code from the target to the host, I see lots of utilities that duplicate code and pain to update. 4 more use cases that would benefit from moving into syz-manager:

syz-crush: repeatedly execute fixed set of programs on all VMs
syz-runtest: execute a fixed set of programs, print results, exit
reproducer testing in syz-ci: run syz/C repro for fixed amount of time and exit
image testing in syz-ci: simply exit after check

All of these are similar: do something else (rather than pkg/fuzzer) after the VM checking procedure. It would nice if the new architecture would support these use-cases natively.

May 06 '24 06:05 dvyukov

Will this allow to create the fuzzer lazily after the VM checking procedure?

For that we need some dynamically configurable layer that would return nil in Next() before we create the fuzzer object and explicitly plug it in.

The somewhat problematic consequence of the PR is that all running jobs of the same type get to execute something in a round robin manner. It doesn't matter for hints/smash/candidate, but it's somewhat problematic for triage jobs, because now it takes ages to see corpus/coverage increases after we have begun fuzzing. It looks like we should still let the triage jobs that we started earlier finish first.

May 06 '24 14:05 a-nogikh

I've had to return priority queues to add order to the triage job executions. Now the question is whether these changes make pkg/fuzzer code better... The PR makes pkg/fuzzer smaller at the expense of pkg/fuzzer/queue, which might already be a good thing though.

While working on moving code from the target to the host, I see lots of utilities that duplicate code and pain to update. 4 more use cases that would benefit from moving into syz-manager:

syz-crush: repeatedly execute fixed set of programs on all VMs

syz-runtest: execute a fixed set of programs, print results, exit

reproducer testing in syz-ci: run syz/C repro for fixed amount of time and exit

image testing in syz-ci: simply exit after check

All of these are similar: do something else (rather than pkg/fuzzer) after the VM checking procedure. It would nice if the new architecture would support these use-cases natively.

It feels like it's not just the execution queues that are common here, but rather taking requests from the queue and executing them on a pool of VMs -- that's what rpc.go and manager.go do.

Repro and image testing are somewhat simpler of these 4 (if any VM crashed at any stage in the process, we don't retry, but just report the result), but syz-crush and runtest are indeed quite similar to what syz-manager does. There indeed seems to be some common functionality like "Here's a VM pool object and here are the VM indices, please serve requests from this queue using a pool of those VMs".

Syz-manager makes things a bit more complicated by dynamically resizing this VM pool to reproduce bugs.

I'm not sure whether merging all these tools into syz-manager is a good idea, but taking this functionality to e.g. a separate package and using the package in all of those cases may be a good first step.

May 06 '24 18:05 a-nogikh

For that we need some dynamically configurable layer that would return nil in Next() before we create the fuzzer object and explicitly plug it in.

Potentially we could always create all layers statically at start, but then some layers (fuzzer) won't be ready to work until they receive additional bits of information, which will need to be passed to them later somehow. Not sure if it's better, but just listing options.

May 07 '24 08:05 dvyukov

Repro and image testing are somewhat simpler of these 4 (if any VM crashed at any stage in the process, we don't retry, but just report the result), but syz-crush and runtest are indeed quite similar to what syz-manager does. There indeed seems to be some common functionality like "Here's a VM pool object and here are the VM indices, please serve requests from this queue using a pool of those VMs".

Significant part of common functionality even with image testing and repro is all of the VM booting and RPC communication (initial handshake, VM checking + normal exchanges).

May 07 '24 08:05 dvyukov

Syz-manager makes things a bit more complicated by dynamically resizing this VM pool to reproduce bugs.

Potentially we could merge repro into the normal manager operation, repro will just supply more special requests for execution. Is special-ness is that they need to run on freshly booted VMs only. We needed that for syz-verifier (for final states of bug triage we also wanted to run tests of clean VMs).

May 07 '24 08:05 dvyukov

I'm not sure whether merging all these tools into syz-manager is a good idea, but taking this functionality to e.g. a separate package and using the package in all of those cases may be a good first step.

Agree that would be cleaner. However, it's even more work and refactoring.

May 07 '24 08:05 dvyukov

I've pushed the changes that move pkg/vminfo to the new execution interface, but it has broken tools/syz-runtest. Looks like I'll need to port pkg/runtest as well.

May 07 '24 15:05 a-nogikh

I've pushed the changes that move pkg/vminfo to the new execution interface, but it has broken tools/syz-runtest. Looks like I'll need to port pkg/runtest as well.

Now you feel part of my pain with all these duplicate tools :)

May 07 '24 15:05 dvyukov

pkg/runtest is using the new interface now, the tests pass. Now I need to refactor tools/syz-runtest.

May 07 '24 17:05 a-nogikh

Mostly got tools/syz-runtest working. It runs syz programs, but still fails to run binaries: they all result in failed to run ["/tmp/syz-runtest2965546546/syz-executor"]: exit status 1.

May 10 '24 16:05 a-nogikh

Codecov Report

Attention: Patch coverage is 47.17286% with 327 lines in your changes are missing coverage. Please review.

Project coverage is 61.1%. Comparing base (ef5d53e) to head (b4a10c5).

:exclamation: Current head b4a10c5 differs from pull request most recent head 36b9899. Consider uploading reports for the commit 36b9899 to get more accurate results

Additional details and impacted files

Files	Coverage Δ
pkg/fuzzer/queue/retry.go	`100.0% <100.0%> (ø)`
pkg/vminfo/vminfo.go	`74.4% <81.8%> (-2.9%)`	:arrow_down:
syz-fuzzer/proc.go	`0.0% <0.0%> (ø)`
pkg/fuzzer/queue/prio_queue.go	`91.7% <78.6%> (ø)`
syz-manager/manager.go	`0.0% <0.0%> (ø)`
pkg/vminfo/features.go	`66.9% <73.7%> (+1.0%)`	:arrow_up:
pkg/vminfo/syscalls.go	`78.6% <71.9%> (-1.2%)`	:arrow_down:
pkg/runtest/run.go	`67.3% <79.4%> (+0.3%)`	:arrow_up:
pkg/ipc/ipc.go	`47.1% <34.5%> (-1.2%)`	:arrow_down:
pkg/fuzzer/fuzzer.go	`68.7% <61.6%> (-14.0%)`	:arrow_down:
... and 3 more

... and 5 files with indirect coverage changes

May 16 '24 09:05 codecov[bot]

tools/syz-runtest's C tests are still broken, but they were already broken on the current mainline.

May 16 '24 12:05 a-nogikh

syzkaller syzkaller copied to clipboard

WIP: pkg/fuzzer: use layered queues

Codecov Report

syzkaller
syzkaller copied to clipboard