clang-offload-bundler incorrectly errors on multi-CCOB binaries
(The analysis below was done by Claude upon discovering that clang-offload-bundler cannot properly unbundle some of our production libraries)
=================================================================== BUG REPORT: clang-offload-bundler Fails on Concatenated CCOB Bundles
Date: 2025-10-30
Reporter: [Your name]
LLVM Project: amd/comgr and clang/tools/clang-offload-bundler
=================================================================== SUMMARY
clang-offload-bundler fails to unbundle files containing multiple concatenated CCOB (Clang Code Object Bundle) compressed bundles with error: "Failed to decompress input: Could not decompress embedded file contents: Src size is incorrect"
This is caused by an incomplete fix in commit efda523188c4 (October 2025) that fixed llvm/lib/Object/OffloadBundle.cpp but did NOT fix the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.
=================================================================== AFFECTED FILES
Real-world ROCm libraries with concatenated CCOB bundles:
- librocblas.so.5 (64 concatenated CCOBs, 8.2 MB .hip_fatbin section)
- librocsparse.so (similar structure)
- Likely other BLAS/sparse libraries
These libraries work fine at runtime (via COMGR) but cannot be processed by the clang-offload-bundler command-line tool.
=================================================================== ROOT CAUSE
Two implementations of CCOB decompression exist, and only one of them has been fixed:
FIXED Implementation (llvm/lib/Object/OffloadBundle.cpp:546):
    StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
✅ Correctly limits decompression to first CCOB's TotalFileSize
✅ Handles concatenated CCOBs properly

BUGGY Implementation (clang/lib/Driver/OffloadBundler.cpp:1270):
    StringRef CompressedData = Blob.substr(HeaderSize);
❌ Reads from HeaderSize to END of buffer
❌ When buffer contains multiple concatenated CCOBs, includes all of them
❌ Zstd decompressor tries to decompress beyond first bundle boundary
❌ Encounters second CCOB header mid-stream, causes corruption/error
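The difference between the two slices can be illustrated with a minimal, standalone sketch. It uses std::string_view as a stand-in for llvm::StringRef (an assumption, so that the example compiles without LLVM headers). The function names sliceToEnd and sliceFirstBundle are hypothetical and do not appear in either source file.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Mirrors the buggy slice: everything after the header, up to the end of
// the buffer, including any bundles concatenated after the first one.
std::string_view sliceToEnd(std::string_view blob, std::size_t headerSize) {
  assert(headerSize <= blob.size());
  return blob.substr(headerSize);
}

// Mirrors the fixed slice: only the payload of the first bundle, bounded by
// the total size that the header declares.
std::string_view sliceFirstBundle(std::string_view blob,
                                  std::size_t headerSize,
                                  std::size_t totalFileSize) {
  assert(headerSize <= totalFileSize && totalFileSize <= blob.size());
  return blob.substr(headerSize, totalFileSize - headerSize);
}
```

For a toy buffer "HDR1aaaaaaHDR2bbb" with a 4-byte header and a declared total size of 10, sliceToEnd returns "aaaaaaHDR2bbb" (the second header and payload leak into the data handed to the decompressor), while sliceFirstBundle returns only "aaaaaa".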
=================================================================== REPRODUCTION
Test File: librocblas.so.5 from ROCm 6.x distribution
.hip_fatbin section: 8,163,887 bytes containing 64 concatenated CCOBs
Structure:
  Offset 0x0:      CCOB header + 1.16 MB compressed (→ 12.41 MB uncompressed)
  Offset 0x129000: CCOB header + 1.01 MB compressed (→ 13.14 MB uncompressed)
  Offset 0x227000: CCOB header + 36.5 KB compressed (→ 1.21 MB uncompressed)
  ... (61 more bundles)
Command:
  $ clang-offload-bundler --type=o --input=librocblas.so.5 --list
Error:
  clang-offload-bundler: error: Failed to decompress input: Could not decompress embedded file contents: Src size is incorrect
Expected: Should list all target triples in the bundle, or at minimum process the first bundle without error.
=================================================================== WHY COMGR WORKS BUT BUNDLER FAILS
COMGR succeeds because the CLR runtime pre-isolates bundles:
- CLR reads first CCOB header's TotalFileSize field (hip_fatbin.cpp:275)
- CLR extracts exactly TotalFileSize bytes (one CCOB)
- CLR passes isolated bundle to COMGR via amd_comgr_do_action()
- COMGR writes to temporary file and calls UnbundleFiles()
- Even though UnbundleFiles() uses the buggy decompressor, the buffer ends exactly at TotalFileSize, so reading to the end of the buffer picks up no extra data
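The pre-isolation step in the list above can be sketched as follows. This is not the CLR code. It ASSUMES a V2-style CCOB header (4-byte "CCOB" magic, 2-byte version, 2-byte compression method, then a 32-bit little-endian total file size at offset 8); newer header versions may use wider fields, so the offsets here should be checked against the format definitions before reuse. The names readLE32 and isolateFirstBundle are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// ASSUMPTION (not taken from the report): V2-style header layout,
//   bytes 0-3 magic "CCOB", 4-5 version, 6-7 compression method,
//   bytes 8-11 total file size (little-endian).
constexpr std::string_view kMagic = "CCOB";
constexpr std::size_t kTotalSizeOffset = 8;
constexpr std::size_t kTotalSizeEnd = kTotalSizeOffset + 4;

// Decode a 32-bit little-endian integer stored at byte offset `off`.
std::uint32_t readLE32(std::string_view s, std::size_t off) {
  assert(off + 4 <= s.size());
  std::uint32_t v = 0;
  for (int i = 3; i >= 0; --i)
    v = (v << 8) | static_cast<unsigned char>(s[off + i]);
  return v;
}

// Return only the first bundle of a section, or nullopt if the magic is
// missing or the declared size does not fit inside the buffer.
std::optional<std::string_view> isolateFirstBundle(std::string_view section) {
  if (section.size() < kTotalSizeEnd ||
      section.substr(0, kMagic.size()) != kMagic)
    return std::nullopt;
  const std::uint32_t total = readLE32(section, kTotalSizeOffset);
  if (total < kTotalSizeEnd || total > section.size())
    return std::nullopt;
  return section.substr(0, total);
}
```

Once a buffer has been trimmed this way, its size equals the declared total size, which is why a decompressor that reads to the end of the buffer still sees exactly one bundle.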
clang-offload-bundler fails because:
- Tool loads entire .hip_fatbin section (all 64 CCOBs)
- Calls decompress() on full buffer
- Buggy decompressor reads from HeaderSize to END
- Zstd tries to decompress 8+ MB thinking it's one compressed stream
- Encounters second CCOB's "CCOB" magic bytes mid-stream
- Decompression validation fails
=================================================================== GIT HISTORY
Commit: efda523188c4 (October 2025)
Title: "Fix compress/decompress in LLVM Offloading API (#150064)"
Changes in llvm/lib/Object/OffloadBundle.cpp:
- StringRef CompressedData = Blob.substr(CurrentOffset);
+ StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
This fix was applied to llvm/lib/Object/OffloadBundle.cpp but NOT to the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.
=================================================================== FIX
Apply the same fix from efda523188c4 to clang/lib/Driver/OffloadBundler.cpp:
File: clang/lib/Driver/OffloadBundler.cpp
Line: ~1270 (in CompressedOffloadBundle::decompress method)
Current code:
    StringRef CompressedData = Blob.substr(HeaderSize);
Fixed code:
    StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
This makes the bundler correctly limit decompression to the first CCOB's declared size, enabling proper handling of concatenated bundles.
=================================================================== ADDITIONAL CONTEXT
The extractOffloadBundle() function (llvm/lib/Object/OffloadBundle.cpp:33-99) shows the intended behavior for concatenated CCOBs:
- Iterate through buffer searching for CCOB magic markers
- Extract one CCOB at a time using take_front(NextbundleStart)
- Decompress isolated bundle
- Advance offset to next bundle
- Repeat until end of buffer
This loop-based extraction relies on decompress() correctly respecting TotalFileSize to avoid reading past the current bundle.
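The iteration described above can be sketched in standalone C++. This is an illustration of the technique only, not the code of extractOffloadBundle(): std::string_view stands in for the LLVM buffer types, the function name splitOnMagic is hypothetical, and the sketch ASSUMES the "CCOB" byte sequence never occurs inside a compressed payload.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Split a buffer into slices that each begin at a "CCOB" magic marker and
// end right before the next marker (or at the end of the buffer).
// ASSUMPTION: the magic bytes do not appear inside compressed data.
std::vector<std::string_view> splitOnMagic(std::string_view buffer) {
  constexpr std::string_view kMagic = "CCOB";
  std::vector<std::string_view> bundles;
  std::size_t start = buffer.find(kMagic);
  while (start != std::string_view::npos) {
    // Search for the next marker strictly after the current one.
    const std::size_t next = buffer.find(kMagic, start + kMagic.size());
    const std::size_t end =
        (next == std::string_view::npos) ? buffer.size() : next;
    assert(end > start);
    bundles.push_back(buffer.substr(start, end - start));
    start = next;  // npos terminates the loop
  }
  return bundles;
}
```

Because compressed bytes could contain the magic sequence by chance, a marker scan alone is a heuristic. Cross-checking each slice against the TotalFileSize declared in its header is the safer boundary, which is the same bound the fixed decompress() applies.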
=================================================================== TEST VERIFICATION
After applying fix, verify with:
$ clang-offload-bundler --type=o --input=librocblas.so.5 --list
Expected output: List of target triples (or at minimum no error)
Test with a known working file:
$ clang-offload-bundler --type=o --input=librccl.so.1.0 --list
This should continue to work (single CCOB bundle).
=================================================================== AFFECTED CODE PATHS
clang/lib/Driver/OffloadBundler.cpp:
- Lines 1233-1329: CompressedOffloadBundle::decompress() [BUGGY]
- Lines 1534-1678: OffloadBundler::UnbundleFiles() [calls decompress]
llvm/lib/Object/OffloadBundle.cpp:
- Lines 509-601: CompressedOffloadBundle::decompress() [FIXED]
- Lines 33-99: extractOffloadBundle() [handles multiple CCOBs]
amd/comgr/src/comgr-compiler.cpp:
- Lines 1318-1436: AMDGPUCompiler::unbundle() [uses clang bundler]
=================================================================== REFERENCES
LLVM Repository: /home/stella/workspace/llvm-project
Key files:
- clang/lib/Driver/OffloadBundler.cpp (needs fix)
- llvm/lib/Object/OffloadBundle.cpp (already fixed)
- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp (CLI tool)
- clang/include/clang/Driver/OffloadBundler.h (CCOB format defs)
Related CLR runtime code: /home/stella/workspace/rocm-systems/projects/clr
- hipamd/src/hip_fatbin.cpp:252-523 (runtime bundle loading)
- hipamd/src/hip_code_object.hpp:44-81 (CCOB format structures)
=================================================================== PRIORITY
Medium-High: Affects tooling for analyzing/repackaging ROCm libraries. The runtime works fine, but command-line unbundling is broken for production ROCm libraries like librocblas and librocsparse.
===================================================================