Potential cache issue leading to inconsistent
Zig Version
0.14.0
(Built on M1 mac running Sonoma 14.2.1)
Steps to Reproduce and Observed Behavior
While trying to build an image for a microcontroller with a custom entry point, I noticed that my build did not consistently produce a binary with the same header.
Say commit A works, commit B removes the line that sets Build.Step.Compile.entry, and commit C re-adds it. When I build B, I can sometimes get a binary whose entry point is still set according to the entry field.
This behaviour is inconsistent, and sometimes hard (for others) to reproduce.
I have a repro branch in microzig called build_bug.
You can try to repro by cloning the repo and running the following from the examples/raspberrypi/rp2xxxx directory.
#!/bin/bash
set -x
rm -r zig-local/ zig-global/
git switch --quiet --detach build_bug^^ # Good
zig build -Dexample=ram_b --release=small --global-cache-dir "$PWD/zig-global" --cache-dir "$PWD/zig-local"
readelf -h zig-out/firmware/ram_blinky.elf | grep Entry
git switch --quiet --detach build_bug^ # Bad?
zig build -Dexample=ram_b --release=small --global-cache-dir "$PWD/zig-global" --cache-dir "$PWD/zig-local"
readelf -h zig-out/firmware/ram_blinky.elf | grep Entry
git switch --quiet build_bug # Good
zig build -Dexample=ram_b --release=small --global-cache-dir "$PWD/zig-global" --cache-dir "$PWD/zig-local"
readelf -h zig-out/firmware/ram_blinky.elf | grep Entry
git switch --quiet --detach build_bug^ # Bad?
zig build -Dexample=ram_b --release=small --global-cache-dir "$PWD/zig-global" --cache-dir "$PWD/zig-local"
readelf -h zig-out/firmware/ram_blinky.elf | grep Entry
Expected Behavior
Expected output is
Entry point address: 0x20000001
Entry point address: 0x20000645
Entry point address: 0x20000001
Entry point address: 0x20000645
But I sometimes get
Entry point address: 0x20000001
Entry point address: 0x20000001
Entry point address: 0x20000001
Entry point address: 0x20000001
This suggests that the build does not always honor the entry field of the Compile step, which changes with each commit.
Do you know if this is a regression in 0.14.0?
Unfortunately I don't, and the repo won't build on 0.13.0 for some time, but I can try to apply my three commits onto an old branch.
I don't think that effort would be hugely valuable; since cache bugs are often to do with filesystem races, it's quite easy for a change to expose an existing cache bug by coincidentally making it more likely. For instance, that happened with #23110; that bug wasn't actually a 0.14.0 regression, but a change I made in that release cycle happened to make it more likely.
It's possible that this is a manifestation of #23110; that depends whether it can be repro'd on master (I'm aware you're currently trying to figure out a more consistent repro before trying that out). But that doesn't seem hugely likely to me.
On master, when I build with a fresh cache, I consistently get the 0x20000001 entry point, which is incorrect for the second and fourth builds.
If I immediately build again with the normal cache directory, I 'correctly' get the weird entry point.
❯ zig version
0.15.0-dev.515+833d4c9ce
❯ bash bla
Entry point address: 0x20000001
Entry point address: 0x20000001
Entry point address: 0x20000001
Entry point address: 0x20000001
❯ zig build -Dexample=ram_b --release=small
❯ readelf.py -h zig-out/firmware/ram_blinky.elf |grep Entry
Entry point address: 0x20000635
❯ git l -3 build_bug
* 8cb66652 2025-05-12 Grazfather (origin/build_bug, rp_ram_image, build_bug) Revert "no entry override"
* 96bd19f7 2025-05-12 Grazfather (HEAD) no entry override
* af93e8f2 2025-05-12 Grazfather wip ram image
Is it possible that the underlying problem is as simple as the -fentry family of options not being included in the hash? From a quick glance at Compilation.zig it doesn't look like it hashes the entry point.
I can also reproduce a similar issue with the following on master:
// build.zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const exe = b.addExecutable(.{
        .name = "repro",
        .root_module = b.createModule(.{
            .target = b.resolveTargetQuery(.{ .cpu_arch = .wasm32, .os_tag = .wasi }),
            .optimize = .ReleaseSmall,
            .root_source_file = b.path("build.zig"),
        }),
    });
    if (b.option([]const u8, "entry", "")) |name| exe.entry = .{ .symbol_name = name };
    b.installArtifact(exe);
}

pub fn main() void {
    std.debug.print("main\n", .{});
}

export fn other() void {
    std.debug.print("other\n", .{});
}
If I run zig build with a clean cache and inspect the Wasm output, I see that it exports _start. If I then run zig build -Dentry=other with the same cache, the build completes instantly (suggesting a cache hit) and the Wasm output still exports _start. Only after I clear the cache do I see the other symbol exported instead.
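The steps above can be scripted roughly as follows. This is only a sketch under several assumptions: the build.zig above is in the current directory, the artifact is installed to zig-out/bin/repro.wasm, and wasm-objdump from WABT (not mentioned in this thread) is available to list exports. It reuses the --cache-dir and --global-cache-dir flags from the original report to start from a clean cache. The names show_exports and run_repro are made up for illustration.

```shell
#!/bin/bash
# Sketch of the Wasm repro; paths and tool choice are assumptions.
wasm=zig-out/bin/repro.wasm   # assumed install location of the artifact

# List the module's exports (assumes WABT's wasm-objdump is installed).
show_exports() {
    wasm-objdump -x -j Export "$1"
}

run_repro() {
    # Start from a clean, private cache so earlier builds cannot interfere.
    rm -rf "$PWD/zig-local" "$PWD/zig-global" zig-out

    # First build: no entry override.
    zig build --cache-dir "$PWD/zig-local" --global-cache-dir "$PWD/zig-global"
    show_exports "$wasm"

    # Second build: same cache, different entry symbol.
    zig build -Dentry=other --cache-dir "$PWD/zig-local" --global-cache-dir "$PWD/zig-global"
    show_exports "$wasm"
}

# Only run when the required tools and build.zig are actually present.
if command -v zig >/dev/null 2>&1 \
    && command -v wasm-objdump >/dev/null 2>&1 \
    && [ -f build.zig ]; then
    run_repro
else
    echo "skipping: needs zig, wasm-objdump and build.zig" >&2
fi
```

If the entry option is missing from the cache key, both export listings should look the same even though the second build asked for a different entry symbol.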
Hah, you're right. There are actually also a few other link options which aren't put into the cache manifest. I'll put up a PR fixing all of those soon.
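To illustrate why an option missing from the cache manifest produces a stale hit, here is a toy model of a cache key. It is not Zig's actual hashing code: cache_key, key_without_entry and key_with_entry are hypothetical helpers, and cksum merely stands in for a real hash function.

```shell
#!/bin/bash
# Toy model of a build cache key; NOT Zig's real manifest logic.

# Hypothetical helper: checksum over every option that gets hashed.
cache_key() {
    printf '%s\n' "$@" | cksum | cut -d' ' -f1
}

# Buggy scheme: $3 (the entry symbol) is accepted but never hashed.
key_without_entry() {
    cache_key "$1" "$2"
}

# Fixed scheme: the entry symbol is part of the key.
key_with_entry() {
    cache_key "$1" "$2" "entry=$3"
}

a=$(key_without_entry build.zig ReleaseSmall _start)
b=$(key_without_entry build.zig ReleaseSmall other)
if [ "$a" = "$b" ]; then
    echo "entry not hashed: same key, the old artifact is reused"
fi

c=$(key_with_entry build.zig ReleaseSmall _start)
d=$(key_with_entry build.zig ReleaseSmall other)
if [ "$c" != "$d" ]; then
    echo "entry hashed: keys differ, a rebuild is triggered"
fi
```

In the buggy scheme, changing only the entry symbol leaves the key unchanged, which matches the instant cache hit seen with -Dentry=other above.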