
Support for LLVM 18 fuzzing

Open Darksonn opened this issue 1 year ago • 30 comments

The Tokio build on oss-fuzz has started failing with the following output:

Running fuzz_linked_list
Running fuzz_stream_map
Traceback (most recent call last):
  File "/usr/local/bin/profraw_update.py", line 129, in <module>
    sys.exit(main())
  File "/usr/local/bin/profraw_update.py", line 120, in main
    profraw_latest = upgrade(profraw_base, sect_prf_cnts, sect_prf_data)
  File "/usr/local/bin/profraw_update.py", line 87, in upgrade
    relativize_address(data, offset + 16, dataref, sect_prf_cnts, sect_prf_data)
  File "/usr/local/bin/profraw_update.py", line 35, in relativize_address
    value = struct.unpack('Q', data[offset:offset + 8])[0]
struct.error: unpack requires a buffer of 8 bytes
warning: /workspace/out/libfuzzer-coverage-x86_64/dumps/fuzz_stream_map.16753151517152022731_0.profraw: unsupported instrumentation profile format version
error: no profile can be merged
[2024-02-21 06:24:21,463 INFO] Finding shared libraries for targets (if any).
[2024-02-21 06:24:21,473 INFO] Finished finding shared libraries for targets.
error: fuzz_stream_map: Failed to load coverage: No such file or directory
error: Could not load coverage information
error: No such file or directory: Could not read profile data!
Traceback (most recent call last):
  File "/usr/local/bin/profraw_update.py", line 129, in <module>
    sys.exit(main())
  File "/usr/local/bin/profraw_update.py", line 120, in main
    profraw_latest = upgrade(profraw_base, sect_prf_cnts, sect_prf_data)
  File "/usr/local/bin/profraw_update.py", line 87, in upgrade
    relativize_address(data, offset + 16, dataref, sect_prf_cnts, sect_prf_data)
  File "/usr/local/bin/profraw_update.py", line 35, in relativize_address
    value = struct.unpack('Q', data[offset:offset + 8])[0]
struct.error: unpack requires a buffer of 8 bytes
warning: /workspace/out/libfuzzer-coverage-x86_64/dumps/fuzz_linked_list.14910788477734546202_0.profraw: unsupported instrumentation profile format version
error: no profile can be merged
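
The `struct.error` in the traceback is what Python raises when a slice such as `data[offset:offset + 8]` runs past the end of the buffer and comes back shorter than 8 bytes. As a minimal sketch of that failure mode (the helper name `read_u64` and the little-endian assumption are illustrative, not taken from `profraw_update.py`), a bounds-checked read could look like this:

```python
import struct


def read_u64(data: bytes, offset: int) -> int:
    """Read a little-endian u64 at `offset`, failing with a clear message.

    Hypothetical helper for illustration; little-endian layout is assumed.
    """
    end = offset + 8
    if offset < 0 or end > len(data):
        # Slicing past the end silently yields fewer than 8 bytes, which is
        # what makes struct.unpack raise "requires a buffer of 8 bytes".
        raise ValueError(
            f"cannot read 8 bytes at offset {offset}: "
            f"buffer holds only {len(data)} bytes"
        )
    return struct.unpack("<Q", data[offset:end])[0]
```

A check like this does not make an unknown profile layout readable; it only reports the out-of-range offset directly instead of surfacing it as a `struct.error`.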

This error seems very similar to #6268, and the latest Rust release has the following in its changelog:

Update the minimum external LLVM to 16.

Let me know if the problem is on our end, or if there's anything I can do to help.

For more information, see the chromium bug for the build failure.

Darksonn avatar Feb 21 '24 12:02 Darksonn

Seeing this across all of the Rust projects I'm involved with:

  • askama
  • redis, cc @jaymell
  • rustls, cc @ctz, @cpu
  • trust-dns (should rename this to hickory-dns), cc @bluejekyll

djc avatar Feb 21 '24 12:02 djc

Also happening with qcms.

jrmuizel avatar Feb 21 '24 14:02 jrmuizel

Also with httparse.

seanmonstar avatar Feb 21 '24 15:02 seanmonstar

I presume one way to fix this would be with https://github.com/google/oss-fuzz/pull/8108 in the infra

maflcko avatar Feb 21 '24 16:02 maflcko

Coverage builds are failing for the following Rust projects as well:

  • toml_edit
  • naga
  • miniz_oxide
  • ron
  • bson-rust
  • gimli

cc @oliverchang @jonathanmetzman @DonggeLiu

manunio avatar Feb 21 '24 19:02 manunio

OK, I will do a roll this week or next depending on when my latest clusterfuzz work ends

jonathanmetzman avatar Feb 21 '24 22:02 jonathanmetzman

There is another failure from https://github.com/apache/opendal/issues/4242

Xuanwo avatar Feb 22 '24 06:02 Xuanwo

Sorry, I'm putting out some fires and will need to get back to this later this week.

jonathanmetzman avatar Feb 27 '24 00:02 jonathanmetzman

ryu also seems to be failing for a similar reason: https://oss-fuzz-build-logs.storage.googleapis.com/log-acc3b39e-2aa0-4769-a499-4c224201e35e.txt

oliverchang avatar Mar 11 '24 06:03 oliverchang

@DavidKorczynski @AdamKorcz do you think this is something you'd be able to help with?

I believe @jonathanmetzman made a start on this, but was blocked on various projects being broken with the upgrade.

Alternatively, could we also pin Rust to an older version for now (https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-builder/install_rust.sh)?

oliverchang avatar Mar 11 '24 06:03 oliverchang

> I believe @jonathanmetzman made a start on this, but was blocked on various projects being broken with the upgrade.

If there are not too many projects affected, an alternative could be to temporarily pin them to the old docker image by hash.

maflcko avatar Mar 11 '24 08:03 maflcko

I proposed a fix in https://github.com/google/oss-fuzz/pull/11681 that downgrades Rust; it can be used until the LLVM 18 upgrade lands.

DavidKorczynski avatar Mar 13 '24 00:03 DavidKorczynski

We've upgraded to a recent clang. @DavidKorczynski, should we undo your change?

jonathanmetzman avatar May 02 '24 19:05 jonathanmetzman

I'm not sure. One of the things I did was pin Rust to a specific version: https://github.com/google/oss-fuzz/blob/6b05655f197d1e36849cfb501730e2391ae358cf/infra/base-images/base-builder/install_rust.sh#L18

Rust was the only language that wasn't pinned to a specific build, so I thought maybe this is something we'd like to keep moving forward?

DavidKorczynski avatar May 02 '24 19:05 DavidKorczynski

There is still the issue that a raw coverage profile generated by a current Rust nightly is incompatible with the LLVM version inside the oss-fuzz coverage image, so the Rust pin is still needed for now. The pin can be moved, or removed, the next time LLVM is bumped. However, that also requires unpinning all projects first. I can look into this later this year.

edit: Possibly related issue https://github.com/google/oss-fuzz/pull/11938 (I'll look into this as well at some point)

maflcko avatar May 03 '24 07:05 maflcko

OpenSK has the same problem, and recently our fuzzing workflow broke as well. I tried different Rust compiler versions between November 2023 and today, and all fail the same way. Is that related to this issue, or should I open a new one?

Example failing run: https://github.com/google/OpenSK/actions/runs/9254285343/job/25455744646

kaczmarczyck avatar May 29 '24 12:05 kaczmarczyck

According to https://oss-fuzz-build-logs.storage.googleapis.com/index.html#opensk the opensk build passed on May 17th. I don't see any changes on May 17th or 18th that would cause the failure, so this seems odd.

maflcko avatar May 29 '24 12:05 maflcko

OpenSK has an open coverage failure bug report since Feb 21st: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66886

I saw this issue and figured that the problem is likely central, and not related to OpenSK.

Then on May 17th, the build failures started. There was no pull request on OpenSK around that time, so I assumed that the error might also be more general and not specific to OpenSK. And I pinged this issue to see if people here are aware, or if they recognize my problem.

kaczmarczyck avatar May 29 '24 13:05 kaczmarczyck

Confirming this fixed the build problem: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=69138#c4

kaczmarczyck avatar May 30 '24 12:05 kaczmarczyck

I've attempted to review the current status of this and various related updates, and I've submitted https://github.com/google/oss-fuzz/pull/12075 to update the pinned Rust nightly to today's nightly. That being said, if there are remaining blockers I'm not aware of, I can try to help hunt those down or otherwise fix our build issues via other means.

alexcrichton avatar Jun 17 '24 14:06 alexcrichton

Did you see https://github.com/google/oss-fuzz/issues/11626#issuecomment-2092460968?

I'd be surprised if the raw coverage profile version fixed itself in the meantime.

I think the required steps are:

  • [ ] Unpin all projects to use the latest base-runner(s). See git grep '@sha256:' ./projects/ output and example commit https://github.com/google/oss-fuzz/commit/d05f35b3223484d60aaa86eaf1f028921939b056
  • [ ] Bump clang/llvm. Similar to https://github.com/google/oss-fuzz/commit/a2c60af93357f2053a45816fcdefdfc02206b64f
  • [ ] Bump rust. Similar to https://github.com/google/oss-fuzz/commit/54cf7a92d169c0f1653f66dc2e5cf07bc87c19ac

If your project requires current rust-nightly, I think the only workaround for now is to disable/break the coverage build for this project.
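
The first step above depends on knowing which projects pin a base image by digest. As a rough sketch of the same idea as the `git grep '@sha256:' ./projects/` command, narrowed to `FROM` lines in each project's `Dockerfile` (the function name `pinned_projects` and that narrowing are assumptions for illustration, not part of the oss-fuzz tooling):

```python
import re
from pathlib import Path

# A digest pin looks like "image@sha256:<64 hex characters>".
PIN_RE = re.compile(r"@sha256:[0-9a-f]{64}")


def pinned_projects(projects_dir):
    """Map project name -> the FROM line that pins a base image by digest.

    Hypothetical helper: only looks at <projects_dir>/<name>/Dockerfile,
    so it is narrower than the git grep it is modeled on.
    """
    pinned = {}
    for dockerfile in sorted(Path(projects_dir).glob("*/Dockerfile")):
        for line in dockerfile.read_text(errors="replace").splitlines():
            stripped = line.strip()
            if stripped.upper().startswith("FROM") and PIN_RE.search(stripped):
                pinned[dockerfile.parent.name] = stripped
                break
    return pinned
```

Calling `pinned_projects("./projects")` from a checkout root would return a dictionary whose keys are the projects that still need unpinning under this narrower definition.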

maflcko avatar Jun 17 '24 14:06 maflcko

I saw it, yeah, but I haven't tried to reproduce it and I wasn't sure how applicable it was. Given that both seem to be using LLVM 18, I was assuming it might be stale by this point. Is there a way to see the failure that happens locally?

As for updating projects: I can grep for "2023-12-28", which has a number of hits, and I can try to update those projects. But when you mention to bump clang/llvm, where is that? I thought it was

https://github.com/google/oss-fuzz/blob/7f915006751d37347078beb0040fb41c0d2a3436/infra/base-images/base-clang/checkout_build_install_llvm.sh#L52-L53

which already looked like llvm 18

alexcrichton avatar Jun 17 '24 15:06 alexcrichton

> Is there a way to see the failure that happens locally?

Should be possible to reproduce in any rust project by bumping the nightly compiler for it and running the coverage build. For example:

git show 2149bb8eb67f7824414f67998ab286df083595fb | git apply --reverse
python infra/helper.py build_fuzzers --sanitizer coverage opensk
python infra/helper.py coverage opensk  # Maybe requires --no-corpus-download or --public

As for the other questions, I've updated my previous comment.

maflcko avatar Jun 17 '24 15:06 maflcko

Running that coverage command I'm getting warnings that look like:

warning: /out/...: raw profile version mismatch: Profile uses raw profile format version = 9; expected version = 8

In the image it looks like llvm-cov is 18.0.0 and digging around it looks like the profile format was bumped to 9 in LLVM 18.1.0. Rust is using 18.1.7 so if the base builder is using 18.0.0 that explains the mismatch.
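
To check which raw profile version a given `.profraw` file carries, the header can be inspected directly. The sketch below assumes the layout of a little-endian 64-bit raw profile as I understand it from LLVM's `InstrProfData.inc`: an 8-byte magic followed by an 8-byte version word whose upper 32 bits hold variant flags. The function name `profraw_version` is hypothetical, and the layout should be verified against the LLVM sources before relying on it:

```python
import struct

# Assumed magic for little-endian 64-bit raw profiles: 0xFF 'l' 'p' 'r' 'o' 'f' 'r' 0x81
RAW_MAGIC_64 = (
    (255 << 56) | (ord("l") << 48) | (ord("p") << 40) | (ord("r") << 32)
    | (ord("o") << 24) | (ord("f") << 16) | (ord("r") << 8) | 129
)


def profraw_version(header: bytes) -> int:
    """Return the raw profile format version from the first 16 header bytes."""
    if len(header) < 16:
        raise ValueError("need at least 16 bytes of header")
    magic, raw_version = struct.unpack("<QQ", header[:16])
    if magic != RAW_MAGIC_64:
        raise ValueError("not a little-endian 64-bit raw profile")
    # Assumption: the upper 32 bits are variant flags, the lower 32 the version.
    return raw_version & 0xFFFFFFFF
```

Reading the first 16 bytes of a dump with `open(path, "rb").read(16)` and passing them to `profraw_version` would, under these assumptions, give the number that the warning above compares against the version `llvm-profdata` expects.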

I updated #12075 to remove all references to 2023-12-28 and have double-checked that all affected projects build with the new nightly version. There's only one project which uses base-builder-rust at a pinned sha256 which is cryptofuzz and I wasn't able to update that. The error didn't look related to Rust stuff, though, but I could very well be wrong.

I'm currently testing out updating to LLVM 18.1.7 and ensuring the coverage bits still work

alexcrichton avatar Jun 17 '24 15:06 alexcrichton

> I'm currently testing out updating to LLVM 18.1.7 and ensuring the coverage bits still work

Coverage will likely break for the pinned projects (git grep '@sha256:' ./projects/), because they still use clang-15, last time I checked.

maflcko avatar Jun 17 '24 16:06 maflcko

Late to the party, but what is the status here? cc @maflcko, cf. https://github.com/google/oss-fuzz/pull/12077#issuecomment-2236780291 (coming from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=70505)

In case it helps: I took a look at https://reviews.llvm.org/D138846 and made https://github.com/catenacyber/oss-fuzz/tree/profraw9. It translates/downgrades profraw version 9 to version 8 so that the current clang's llvm-profdata merge can read nightly-rust profraw files. But then llvm-cov fails, because there was also a bump for the ELF section with profile stuff (INSTR_PROF_COVMAP_VERSION). I can do it the other way (and update profraw8 to 9).

catenacyber avatar Aug 16 '24 08:08 catenacyber

So the alternative to updating llvm to match rust would be to patch profraw_update.py, and also to create a new program that updates the covmap section of the compiled fuzz targets to remove this new bitmap information...

TL;DR Updating llvm seems the right way to go

By the way, I am not sure profraw_update.py is needed anymore for profraw version 5 (Swift?)

catenacyber avatar Aug 16 '24 08:08 catenacyber

> cc @maflcko cf #12077 (comment)

I haven't looked into profraw_update.py (or downgrade) at all.

> TL;DR Updating llvm seems the right way to go

Agree. However, some legacy projects will have their coverage build broken by that. I am trying to fix them to work with llvm 18, but for some it is quite an effort, so help is appreciated. I fear for some projects it will be too hard and they will remain broken, but I am not sure what number of projects with broken coverage is acceptable.

maflcko avatar Aug 16 '24 08:08 maflcko

> Agree. However, some legacy projects will have their coverage build broken by that.

So updating profraw_update.py may help there.

catenacyber avatar Aug 16 '24 08:08 catenacyber

See https://github.com/google/oss-fuzz/pull/12365

catenacyber avatar Aug 16 '24 09:08 catenacyber