
Fuzz target run intermittently since Nov 2

Open pitrou opened this issue 1 month ago • 15 comments

In the arrow project it seems one of our fuzz targets (parquet-arrow-fuzz) has not been run since Nov 2. Other fuzz targets are fine. I cannot find any obvious reason for this, as the build status is successful.

This can be seen in the fuzzer statistics: https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2025-09-09&date_end=2025-11-29&fuzzer=libFuzzer_arrow_parquet-arrow-fuzz&project=arrow

pitrou avatar Nov 10 '25 13:11 pitrou

@DavidKorczynski

pitrou avatar Nov 10 '25 13:11 pitrou

When I look through the log files, I see that some of them are almost empty, for example https://console.cloud.google.com/storage/browser/_details/arrow-logs.clusterfuzz-external.appspot.com/None/libfuzzer_asan_arrow/2025-11-10/08:01:43:499859.log;tab=live_object?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22)) :

Component revisions (build rNone):
Not available.

Bot name: oss-fuzz-tworker-1-76r6
Return code: 1

pitrou avatar Nov 10 '25 16:11 pitrou

Well, it seems that fuzzer statistics have appeared for Nov 11 (though not for the interim days). Did something change? In any case, I'll keep looking at the stats over the next few days and will close this issue if the problem doesn't occur again.

pitrou avatar Nov 12 '25 08:11 pitrou

I also still see some empty log files, for example: https://console.cloud.google.com/storage/browser/_details/arrow-logs.clusterfuzz-external.appspot.com/None/libfuzzer_asan_arrow/2025-11-12/01:07:04:470871.log;tab=live_object?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))

Component revisions (build rNone):
Not available.

Bot name: oss-fuzz-tworker-2-hn3j
Return code: 1

pitrou avatar Nov 12 '25 09:11 pitrou

Today Nov 13 the fuzzer statistics are missing again for this particular fuzzer (libFuzzer_arrow_parquet-arrow-fuzz). The other fuzzers are ok. Is there anything I can do to investigate and solve this issue? @oliverchang

pitrou avatar Nov 14 '25 09:11 pitrou

Update: fuzz target statistics are available for Nov 15, but not for later days. This applies to both libFuzzer_arrow_parquet-arrow-fuzz and afl_arrow_parquet-arrow-fuzz. Attaching a screenshot of the latest stats.

[Image: screenshot of the latest fuzzer statistics]

pitrou avatar Nov 20 '25 14:11 pitrou

I think fuzz targets are not necessarily run every day. At least I can see skips in my projects, even when looking at a time span from last year.

maflcko avatar Nov 20 '25 16:11 maflcko

Well, what's weird is that prior to Nov 2, there were almost no "holes" in the daily stats (of course, this is from visual inspection, so I might have missed one or two). Now I have only 3 days in the stats for the last 2 weeks.
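To replace visual inspection, the "holes" could be listed programmatically. This is only a minimal sketch: it assumes the days that have stats were already collected by hand from the fuzzer statistics page, and `missing_days` is a hypothetical helper, not part of any OSS-Fuzz or ClusterFuzz tooling.

```python
from datetime import date, timedelta


def missing_days(days_with_stats, start, end):
    """Return every day in [start, end] that has no stats entry.

    days_with_stats: iterable of datetime.date values (assumed to be
    collected manually from the by-day fuzzer stats view).
    """
    present = set(days_with_stats)
    span = (end - start).days
    all_days = (start + timedelta(days=i) for i in range(span + 1))
    return [day for day in all_days if day not in present]


if __name__ == "__main__":
    # Illustrative input only: the two days mentioned earlier in this thread.
    seen = [date(2025, 11, 11), date(2025, 11, 15)]
    for day in missing_days(seen, date(2025, 11, 10), date(2025, 11, 16)):
        print(day.isoformat())
```

An empty result for a given window would mean the target produced stats on every day in it.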

In any case, if that's normal behavior, it would be good to have a doc or FAQ entry explaining the execution policy a bit. I can help write or review that doc if I know what to put in it :-).

pitrou avatar Nov 20 '25 17:11 pitrou

I think the docs are hidden in the code:

https://github.com/google/clusterfuzz/blob/3612e1619dc53c180274fc4a7501051facbe99a4/src/clusterfuzz/_internal/cron/fuzzer_and_job_weights.py#L173

LLM generated (experimental):

Fuzz Scheduling

  • schedule_fuzz.py builds a weighted pool of FuzzerJob/project pairs, then randomly samples available_cpus / 2 tasks, so OSS-Fuzz CPU time is distributed in proportion to each actual_weight; weights are normalized per project before enqueuing (src/clusterfuzz/_internal/cron/schedule_fuzz.py:159).
  • The scheduler consumes the actual_weight property defined on FuzzerJob (weight * multiplier), so both the per-target weight and the per-job multiplier feed into the selection (src/clusterfuzz/_internal/datastore/data_types.py:1532).
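
The sampling described above could look roughly like the sketch below. It is not the ClusterFuzz code: the `FuzzerJob` class here is a hypothetical stand-in with only the fields the summary mentions, `sample_tasks` is an invented name, and the per-project normalization step is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class FuzzerJob:
    """Hypothetical stand-in for the datastore entity (assumed fields only)."""
    fuzzer: str
    job: str
    weight: float = 1.0
    multiplier: float = 1.0

    @property
    def actual_weight(self) -> float:
        # Per the summary: per-target weight times per-job multiplier.
        return self.weight * self.multiplier


def sample_tasks(fuzzer_jobs, available_cpus, rng=None):
    """Draw available_cpus // 2 tasks with probability proportional to actual_weight.

    Sampling is with replacement, so a heavily weighted target can be picked
    several times while a lightly weighted one may not be picked at all.
    """
    rng = rng or random.Random()
    num_tasks = available_cpus // 2
    weights = [fj.actual_weight for fj in fuzzer_jobs]
    if num_tasks <= 0 or sum(weights) <= 0:
        return []
    return rng.choices(fuzzer_jobs, weights=weights, k=num_tasks)
```

Under this kind of scheme, a target with a small weight in a large pool can plausibly go whole days without being selected, which would show up as gaps in the by-day stats.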

Weight Signals

  • A daily cron job (fuzzer_and_job_weights.py) looks at BigQuery stats to adjust each FuzzTargetJob.weight before scheduling. It lowers fuzzers that show coverage stagnation because the coverage comparison query explicitly flags targets whose edge coverage changed by <1% over two weeks, dropping their weight to 0.75 (src/clusterfuzz/_internal/cron/fuzzer_and_job_weights.py:173).
  • The same cron script also defines other QuerySpecifications that reduce weights for frequent crashes, slow units, timeouts, OOMs, and startup crashes, matching your “de-prioritize plateaued or problematic targets” idea (src/clusterfuzz/_internal/cron/fuzzer_and_job_weights.py:90 and :105).
  • It boosts new fuzzers/jobs (weight = 5) when they appear for the first time so productive newcomers get extra CPU, reflecting the “coverage gains reward” notion even though there isn’t an explicit “coverage increase” rule (src/clusterfuzz/_internal/cron/fuzzer_and_job_weights.py:150).
  • Job multipliers (based on sanitizer/engine tooling and the number of fuzz targets in a job) ensure resource fairness and give faster targets proportionally more CPU (src/clusterfuzz/_internal/cron/fuzzer_and_job_weights.py:355).
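
As a rough illustration of the weight signals above, the sketch below uses only the two values the summary states (0.75 for stagnant coverage, 5 for new fuzzers). Everything else is an assumption: the baseline weight of 1.0, the `TargetStats` fields, the `adjusted_weight` name, and the rule that the new-fuzzer boost takes precedence. The real cron job derives these from BigQuery queries and also handles crashes, slow units, timeouts, and OOMs, which are omitted here.

```python
from dataclasses import dataclass

DEFAULT_WEIGHT = 1.0         # assumption: baseline weight when no rule matches
NEW_FUZZER_WEIGHT = 5.0      # value stated in the summary above
STAGNATION_WEIGHT = 0.75     # value stated in the summary above
STAGNATION_THRESHOLD = 0.01  # "<1%" edge-coverage change over two weeks


@dataclass
class TargetStats:
    """Hypothetical per-target summary (field names are illustrative)."""
    is_new: bool
    edges_two_weeks_ago: int
    edges_now: int


def adjusted_weight(stats: TargetStats) -> float:
    """Return the weight for one target under the two rules sketched here."""
    if stats.is_new:
        # Assumed precedence: a brand-new target gets the boost regardless.
        return NEW_FUZZER_WEIGHT
    if stats.edges_two_weeks_ago > 0:
        delta = abs(stats.edges_now - stats.edges_two_weeks_ago)
        if delta / stats.edges_two_weeks_ago < STAGNATION_THRESHOLD:
            return STAGNATION_WEIGHT
    return DEFAULT_WEIGHT
```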

maflcko avatar Nov 20 '25 17:11 maflcko

That's... annoying, especially as we've been making a bunch of changes to that particular fuzz target in recent weeks.

pitrou avatar Nov 20 '25 21:11 pitrou

@DavidKorczynski @oliverchang Gentle ping here. Is the absence of daily stats for a fuzz target something I should worry about? Is it indicative of the fuzzer not being run regularly? How can I act on it?

pitrou avatar Dec 03 '25 08:12 pitrou

At least for us, it seems to be working. https://github.com/bitcoin/bitcoin/issues/33981 was found by OSS-Fuzz, even though the target is not run every day:

https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2025-11-19&date_end=2025-11-28&fuzzer=libFuzzer_bitcoin-core_package_rbf&project=bitcoin-core

maflcko avatar Dec 03 '25 09:12 maflcko

> At least for us, it seems to be working. bitcoin/bitcoin#33981 was found by OSS-Fuzz, even though the target is not run every day:

Hmm. Ideally I'd like some confirmation from OSS-Fuzz maintainers that ours are working currently (parquet-arrow-fuzz specifically).

@emkornfield FYI

pitrou avatar Dec 08 '25 16:12 pitrou

CC @theneuralbit @sclmn who might have better luck pinging internally.

emkornfield avatar Dec 08 '25 17:12 emkornfield

> we've been making a bunch of changes to that particular fuzz target in recent weeks.

I guess you already know this, but the stats and coverage are public, and they do show a new fuzz target, as well as a coverage increase in recent weeks:

https://introspector.oss-fuzz.com/project-profile?project=arrow

Looking at the line coverage, not all lines are covered, but I haven't looked deeper into whether they are unreachable error conditions or should be reachable: https://storage.googleapis.com/oss-fuzz-coverage/arrow/reports/20251202/linux/src/arrow/cpp/src/parquet/arrow/reader.cc.html#L1490.

maflcko avatar Dec 09 '25 09:12 maflcko

I think there's a good chance your project is being starved for hours :-( There has been some infra instability that I've been looking into since returning from leave. I'm optimistic we will mitigate it somewhat by end of next week.

jonathanmetzman avatar Dec 16 '25 02:12 jonathanmetzman

Wow, thanks for looking into this @jonathanmetzman :)

pitrou avatar Dec 16 '25 08:12 pitrou