macos shared libraries built with MachO are not reproducible

Open jvolkman opened this issue 2 years ago • 4 comments

Zig Version

0.10.0-dev.2382+97816e3cb

Steps to Reproduce

For example:

cd /tmp
mkdir -p test1/lib test2/lib
echo "void test() {}" > test1/lib/testlib.c
echo "void test() {}" > test2/lib/testlib.c
cd test1
zig cc -shared lib/testlib.c -o lib/testlib -O2
cd ../test2
zig cc -shared lib/testlib.c -o lib/testlib -O2
cd ..
diff <(otool -l test1/lib/testlib) <(otool -l test2/lib/testlib)

1c1
< test1/lib/testlib:
---
> test2/lib/testlib:
230c230
<          name o/3901a14c16c71f6c1ebc3638cbb5ecca/testlib (offset 24)
---
>          name o/d97fd3198b9e2035c2eb01d44e6836d9/testlib (offset 24)
250c250
<     uuid D7930157-6814-3CAE-F2E9-E722E940E983
---
>     uuid 832BDC1A-C7D5-E051-1846-CE43BA7EAD6A

First, thanks for this project. I'm using it with bazel-zig-cc and the two (bazel + zig) seem like they're going to be a great match.

Bazel really does best with reproducible artifacts, so ideally identical inputs would produce identical outputs. When built with the clang that ships with macOS, the two outputs are identical. With Zig, the differences come from two Mach-O load commands: LC_ID_DYLIB and LC_UUID.
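
A byte-for-byte comparison is a stricter check of this requirement than diffing `otool -l` listings. The following is a minimal sketch, assuming a hypothetical helper name `same_bytes` (not part of Zig or Bazel) and the paths from the reproduction steps:

```shell
# Hypothetical helper: succeeds only if both files are byte-for-byte identical.
same_bytes() {
    cmp -s "$1" "$2"
}

# Example use with the paths from the reproduction steps (not run here):
#   same_bytes test1/lib/testlib test2/lib/testlib && echo "reproducible"
```

`cmp -s` prints nothing and reports the result only through its exit status, which makes it easy to use in a build script.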

LC_ID_DYLIB is set to some path that makes sense for Zig's cache (as far as I can tell). With clang, it's set to the path of the output library relative to PWD when the build occurs, which in this case would be lib/testlib. Zig allows overriding this value with -Wl,-install_name=XXX, but this is somewhat clunky in that the command must change for each library if the name is going to be at all correct, and can't just be placed in a reused LDFLAGS or similar. Would be nice if there was a way to mimic clang/lld's behavior.
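
As a possible stopgap, the per-library flag could be derived from the output path so that one build rule serves every library. This is only a sketch: the helper name `install_name_flag` is hypothetical, and the flag spelling is the `-Wl,-install_name=XXX` form mentioned above.

```shell
# Hypothetical helper: print the install_name linker flag for an output path.
install_name_flag() {
    printf '%s\n' "-Wl,-install_name=$1"
}

# Example use (not run here):
#   out=lib/testlib
#   zig cc -shared lib/testlib.c -o "$out" -O2 "$(install_name_flag "$out")"
```

This keeps the recorded name equal to the relative output path, which is the clang behavior described above, without hand-editing the flag for each library.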

LC_UUID appears to just be random bytes all the time. With lld, this value is actually a hash of the library. lld also supports a -no_uuid option which causes this load command to be omitted entirely, which would probably be fine (although I don't really know what these UUIDs are used for).

Expected Behavior

No differences

Actual Behavior

There are differences in the LC_ID_DYLIB and LC_UUID load commands.

jvolkman avatar May 27 '22 07:05 jvolkman

Check out the Build Mode section of the documentation. In particular:

  • Debug
    • No reproducible build requirement
  • ReleaseFast
    • Reproducible build
  • ReleaseSafe
    • Reproducible build
  • ReleaseSmall
    • Reproducible build

This FAQ entry may also be helpful.

Closing as working as designed, but happy to keep discussing here.

andrewrk avatar Jun 02 '22 01:06 andrewrk

Thanks for looking @andrewrk.

It seems the Build Mode parameters don't work with zig cc. Given the FAQ entry, is the expectation that e.g. zig cc -shared lib/testlib.c -o lib/testlib -O2 would be reproducible across runs (and with Zig's cache cleared)? On my machine I'm still seeing differences in LC_ID_DYLIB and LC_UUID.

Or if this is just unsupported, I'll try to come up with some wrapper to modify these fields after linking.
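
For the LC_ID_DYLIB half, such a wrapper could call Apple's `install_name_tool -id` after linking. A rough sketch, assuming a hypothetical function name `fix_dylib_id` and a hypothetical `DRY_RUN` variable that only prints the command instead of running it; this step does not change LC_UUID:

```shell
# Hypothetical post-link step: set LC_ID_DYLIB to the given name.
# Usage: fix_dylib_id <library> [<id>]   (id defaults to the library path)
fix_dylib_id() {
    lib=$1
    id=${2:-$1}
    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be executed.
        printf '%s\n' "install_name_tool -id $id $lib"
    else
        install_name_tool -id "$id" "$lib"
    fi
}

# Example use (not run here):
#   fix_dylib_id lib/testlib
```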

jvolkman avatar Jun 02 '22 02:06 jvolkman

Reopening since it happened with -O2. Thanks for the report!

andrewrk avatar Jun 02 '22 04:06 andrewrk

@jvolkman thanks for submitting the issue, and nice find on the UUID being a hash of the library. I did not know that! I think this should be easy to tweak in the linker. Also, just to clarify: the UUID is used internally by the OS to track the binary across the system with services such as Spotlight, but not only that. If you run dsymutil on the binary to generate the dSYM bundle, the UUID is used to keep the two in sync so that the debugger can find the matching dSYM bundle containing the DWARF info anywhere on your system.

-no_uuid is not supported yet by the linker, but I'll be happy to add that too (perhaps in a follow-up PR though).

Re LC_ID_DYLIB: I think pointing it at the cache dir is a mistake. I'd actually expect it to point to lib/libX.dylib, as you pointed out, so I'll work on a fix for that too.

kubkon avatar Aug 10 '22 22:08 kubkon