
clang-offload-bundler incorrectly errors on multi-CCOB binaries

Open stellaraccident opened this issue 1 month ago • 2 comments

(The analysis below was done by Claude after we discovered that clang-offload-bundler cannot properly unbundle some of our production libraries.)

=================================================================== BUG REPORT: clang-offload-bundler Fails on Concatenated CCOB Bundles

Date: 2025-10-30
Reporter: [Your name]
LLVM Project: amd/comgr and clang/tools/clang-offload-bundler

=================================================================== SUMMARY

clang-offload-bundler fails to unbundle files that contain multiple concatenated CCOB (Compressed Clang Offload Bundle) bundles. It reports:

    Failed to decompress input: Could not decompress embedded file contents: Src size is incorrect

This is caused by an incomplete fix in commit efda523188c4 (October 2025) that fixed llvm/lib/Object/OffloadBundle.cpp but did NOT fix the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.

=================================================================== AFFECTED FILES

Real-world ROCm libraries with concatenated CCOB bundles:

  • librocblas.so.5 (64 concatenated CCOBs, 8.2 MB .hip_fatbin section)
  • librocsparse.so (similar structure)
  • Likely other BLAS/sparse libraries

These libraries work fine at runtime (via COMGR) but cannot be processed by clang-offload-bundler command-line tool.

=================================================================== ROOT CAUSE

Two implementations of CCOB decompression exist, and only one of them has been fixed:

FIXED Implementation (llvm/lib/Object/OffloadBundle.cpp:546):

    StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

✅ Correctly limits decompression to first CCOB's TotalFileSize
✅ Handles concatenated CCOBs properly

BUGGY Implementation (clang/lib/Driver/OffloadBundler.cpp:1270):

    StringRef CompressedData = Blob.substr(HeaderSize);

❌ Reads from HeaderSize to END of buffer
❌ When buffer contains multiple concatenated CCOBs, includes all of them
❌ Zstd decompressor tries to decompress beyond first bundle boundary
❌ Encounters second CCOB header mid-stream, causes corruption/error
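The difference between the two slices can be seen in a small standalone sketch. It uses std::string_view as a stand-in for llvm::StringRef (both offer substr(Start) and substr(Start, Length)); the function names and the toy buffer are illustrative assumptions, not code from either file.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Unbounded slice, mirroring Blob.substr(HeaderSize): everything after the
// header up to the END of the buffer, including any following bundles.
std::string_view sliceToEnd(std::string_view Blob, std::size_t HeaderSize) {
  assert(HeaderSize <= Blob.size());
  return Blob.substr(HeaderSize);
}

// Bounded slice, mirroring Blob.substr(HeaderSize, TotalFileSize - HeaderSize):
// only the payload of the first bundle.
std::string_view sliceBundle(std::string_view Blob, std::size_t HeaderSize,
                             std::size_t TotalFileSize) {
  assert(HeaderSize <= TotalFileSize && TotalFileSize <= Blob.size());
  return Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
}
```

With a toy buffer of two bundles, "HDRaaaa" followed by "HDRbb" (3-byte header, first bundle 7 bytes in total), sliceToEnd(Blob, 3) also returns the second bundle's header and payload, while sliceBundle(Blob, 3, 7) returns only "aaaa".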

=================================================================== REPRODUCTION

Test File: librocblas.so.5 from ROCm 6.x distribution
.hip_fatbin section: 8,163,887 bytes containing 64 concatenated CCOBs

Structure:

    Offset 0x0:      CCOB header + 1.16 MB compressed (→ 12.41 MB uncompressed)
    Offset 0x129000: CCOB header + 1.01 MB compressed (→ 13.14 MB uncompressed)
    Offset 0x227000: CCOB header + 36.5 KB compressed (→ 1.21 MB uncompressed)
    ... (61 more bundles)

Command:

    $ clang-offload-bundler --type=o --input=librocblas.so.5 --list

Error:

    clang-offload-bundler: error: Failed to decompress input: Could not decompress embedded file contents: Src size is incorrect

Expected: Should list all target triples in the bundle, or at minimum process the first bundle without error.

=================================================================== WHY COMGR WORKS BUT BUNDLER FAILS

COMGR succeeds because the CLR runtime pre-isolates bundles:

  1. CLR reads first CCOB header's TotalFileSize field (hip_fatbin.cpp:275)
  2. CLR extracts exactly TotalFileSize bytes (one CCOB)
  3. CLR passes isolated bundle to COMGR via amd_comgr_do_action()
  4. COMGR writes to temporary file and calls UnbundleFiles()
  5. Even though UnbundleFiles uses buggy decompressor, there's no extra data to cause corruption

clang-offload-bundler fails because:

  1. Tool loads entire .hip_fatbin section (all 64 CCOBs)
  2. Calls decompress() on full buffer
  3. Buggy decompressor reads from HeaderSize to END
  4. Zstd tries to decompress 8+ MB thinking it's one compressed stream
  5. Encounters second CCOB's "CCOB" magic bytes mid-stream
  6. Decompression validation fails
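The pre-isolation step that the CLR performs can be sketched as follows. The header layout here is a deliberately simplified assumption (4-byte "CCOB" magic followed by a 4-byte little-endian total size); it is NOT the real CCOB header, and isolateFirstBundle and readLE32 are hypothetical names, not CLR or LLVM functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Toy header (assumption): 4-byte magic + 4-byte little-endian total size.
constexpr std::string_view ToyMagic = "CCOB";
constexpr std::size_t ToyHeaderSize = 8;

// Decodes the first four bytes of Bytes as a little-endian 32-bit value.
std::uint32_t readLE32(std::string_view Bytes) {
  std::uint32_t Value = 0;
  for (int I = 3; I >= 0; --I)
    Value = (Value << 8) | static_cast<unsigned char>(Bytes[I]);
  return Value;
}

// Returns exactly the first bundle (header + payload) as declared by its own
// size field, or nullopt if the header is missing or the size is implausible.
std::optional<std::string_view> isolateFirstBundle(std::string_view Buffer) {
  if (Buffer.size() < ToyHeaderSize || Buffer.substr(0, 4) != ToyMagic)
    return std::nullopt;
  std::uint32_t TotalSize = readLE32(Buffer.substr(4, 4));
  if (TotalSize < ToyHeaderSize || TotalSize > Buffer.size())
    return std::nullopt;
  return Buffer.substr(0, TotalSize);
}
```

A decompressor that receives only the isolated view never sees the bytes of the next bundle, which is why an unbounded slice does no harm on the COMGR path.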

=================================================================== GIT HISTORY

Commit: efda523188c4 (October 2025)
Title: "Fix compress/decompress in LLVM Offloading API (#150064)"

Changes in llvm/lib/Object/OffloadBundle.cpp:

    - StringRef CompressedData = Blob.substr(CurrentOffset);
    + StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

This fix was applied to llvm/lib/Object/OffloadBundle.cpp but NOT to the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.

=================================================================== FIX

Apply the same fix from efda523188c4 to clang/lib/Driver/OffloadBundler.cpp:

File: clang/lib/Driver/OffloadBundler.cpp
Line: ~1270 (in CompressedOffloadBundle::decompress method)

Current code:

    StringRef CompressedData = Blob.substr(HeaderSize);

Fixed code:

    StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

This makes the bundler correctly limit decompression to the first CCOB's declared size, enabling proper handling of concatenated bundles.

=================================================================== ADDITIONAL CONTEXT

The llvm/lib/Object/OffloadBundle.cpp:33-99 extractOffloadBundle() function shows the intended behavior for concatenated CCOBs:

  1. Iterate through buffer searching for CCOB magic markers
  2. Extract one CCOB at a time using take_front(NextbundleStart)
  3. Decompress isolated bundle
  4. Advance offset to next bundle
  5. Repeat until end of buffer

This loop-based extraction relies on decompress() correctly respecting TotalFileSize to avoid reading past the current bundle.
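The loop described above can be sketched roughly as below. This is a simplified illustration using std::string_view, not the extractOffloadBundle() implementation: splitAtMagic is a hypothetical name, and it skips the decompression step and only shows how the buffer is cut at each magic marker.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Splits Buffer into one view per bundle, cutting at every occurrence of
// Magic. Bytes before the first marker are ignored.
std::vector<std::string_view> splitAtMagic(std::string_view Buffer,
                                           std::string_view Magic = "CCOB") {
  std::vector<std::string_view> Bundles;
  if (Magic.empty())
    return Bundles; // an empty marker would never advance the search
  std::size_t Offset = Buffer.find(Magic);
  while (Offset != std::string_view::npos) {
    // The current bundle ends where the next marker starts, or at the end
    // of the buffer if there is no further marker.
    std::size_t Next = Buffer.find(Magic, Offset + Magic.size());
    std::size_t End = (Next == std::string_view::npos) ? Buffer.size() : Next;
    Bundles.push_back(Buffer.substr(Offset, End - Offset));
    Offset = Next;
  }
  return Bundles;
}
```

One caveat of scanning for a marker alone: compressed payload bytes could happen to contain the same four bytes, so a size field from the header is the more reliable boundary when it is available.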

=================================================================== TEST VERIFICATION

After applying fix, verify with:

$ clang-offload-bundler --type=o --input=librocblas.so.5 --list

Expected output: List of target triples (or at minimum no error)

Test with known working file:

$ clang-offload-bundler --type=o --input=librccl.so.1.0 --list

Should continue to work (single CCOB bundle).

=================================================================== AFFECTED CODE PATHS

clang/lib/Driver/OffloadBundler.cpp:

  • Line 1233-1329: CompressedOffloadBundle::decompress() [BUGGY]
  • Line 1534-1678: OffloadBundler::UnbundleFiles() [calls decompress]

llvm/lib/Object/OffloadBundle.cpp:

  • Line 509-601: CompressedOffloadBundle::decompress() [FIXED]
  • Line 33-99: extractOffloadBundle() [handles multiple CCOBs]

amd/comgr/src/comgr-compiler.cpp:

  • Line 1318-1436: AMDGPUCompiler::unbundle() [uses clang bundler]

=================================================================== REFERENCES

LLVM Repository: /home/stella/workspace/llvm-project

Key files:

  • clang/lib/Driver/OffloadBundler.cpp (needs fix)
  • llvm/lib/Object/OffloadBundle.cpp (already fixed)
  • clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp (CLI tool)
  • clang/include/clang/Driver/OffloadBundler.h (CCOB format defs)

Related CLR runtime code: /home/stella/workspace/rocm-systems/projects/clr

  • hipamd/src/hip_fatbin.cpp:252-523 (runtime bundle loading)
  • hipamd/src/hip_code_object.hpp:44-81 (CCOB format structures)

=================================================================== PRIORITY

Medium-High: Affects tooling for analyzing/repackaging ROCm libraries. The runtime works fine, but command-line unbundling is broken for production ROCm libraries like librocblas and librocsparse.

===================================================================

stellaraccident, Oct 31 '25 04:10