gllvm
Unsound bitcode collection when a single file is compiled multiple times
GLLVM doesn't currently distinguish between multiple compilations of the same input file in a single build. For example, imagine the following:
all: foo.exe foo.patched.exe

%.exe: $(SRC_DIR)/%.c
	mkdir -p $(dir $@)
	$(CC) $(CFLAGS) -o $@ $^

%.patched.exe: $(SRC_DIR)/%.c
	mkdir -p $(dir $@)
	$(CC) $(CFLAGS) -DPATCHED=1 -o $@ $^
When make all is run, foo.c is compiled twice: once with -DPATCHED=1 and once without.
GLLVM, however, produces only one .foo.c.{o,bc} tuple, meaning that the get-bc-collected bitcode for both foo.exe and foo.patched.exe is the same (from whichever target make ran last).
I think the solution here is to rewrite GLLVM's object and bitcode file emission to use content-addressed filenames, rather than path-computed filenames.
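To make the idea concrete, here is a rough sketch of the difference between the two naming schemes. It is not GLLVM's actual code (GLLVM is written in Go): the function names, the choice of SHA-256, and the 16-character digest prefix are all assumptions made for illustration, and the path-computed scheme is modelled only on the .foo.c.{o,bc} names mentioned above.

```python
import hashlib
from pathlib import PurePosixPath


def path_computed_name(src: str, ext: str) -> str:
    """Hypothetical model of the current scheme: the name depends only on the source path."""
    p = PurePosixPath(src)
    return str(p.with_name(f".{p.name}.{ext}"))


def content_addressed_name(src: str, data: bytes, ext: str) -> str:
    """Hypothetical alternative: embed a digest of the emitted bytes in the name."""
    p = PurePosixPath(src)
    digest = hashlib.sha256(data).hexdigest()[:16]  # prefix length is an arbitrary choice
    return str(p.with_name(f".{p.name}.{digest}.{ext}"))


# Stand-ins for the bitcode emitted by the two compilations of foo.c.
plain_bitcode = b"bitcode built without -DPATCHED"
patched_bitcode = b"bitcode built with -DPATCHED=1"

# Same name for both compilations, so the second write replaces the first.
print(path_computed_name("src/foo.c", "bc"))

# Distinct names, because the contents differ.
print(content_addressed_name("src/foo.c", plain_bitcode, "bc"))
print(content_addressed_name("src/foo.c", patched_bitcode, "bc"))
```

With names like these, each executable would need to record which digest it was built from so that get-bc can find the matching bitcode. How that reference would be stored is outside the scope of this sketch.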
To be more precise: GLLVM actually produces two tuples, but clobbers the first (the foo.exe one) with the second (foo.patched.exe).
I wonder if we can please all of the build systems all of the time. If the output was called foo_patched rather than
foo.patched, we'd be OK, right?
I'm not 100% sure -- I think the confusion happens with the source files, since GLLVM special-cases the "single" compilation mode and will clobber .foo.c.o and .foo.c.bc regardless of the output target.
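A toy simulation of that behavior, under the assumption that the hidden artifact name is derived from the source path alone (as the .foo.c.o / .foo.c.bc names suggest). The helper names and the dictionary standing in for the filesystem are hypothetical; this is a sketch of the suspected failure mode, not GLLVM's implementation.

```python
from pathlib import PurePosixPath


def hidden_artifact(src: str, ext: str) -> str:
    """Assumed naming: '.<source file name>.<ext>' next to the source file."""
    p = PurePosixPath(src)
    return str(p.with_name(f".{p.name}.{ext}"))


def simulate_build(compilations):
    """Model the filesystem as a dict; each compilation writes its bitcode artifact.

    `compilations` is a list of (source, output, flags) tuples.
    """
    fs = {}
    for src, output, flags in compilations:
        label = " ".join(flags) if flags else "no extra flags"
        # The key ignores `output` and `flags`, so a later compilation of the
        # same source overwrites the earlier entry.
        fs[hidden_artifact(src, "bc")] = f"bitcode for {output} ({label})"
    return fs


dotted = simulate_build([
    ("src/foo.c", "foo.exe", []),
    ("src/foo.c", "foo.patched.exe", ["-DPATCHED=1"]),
])
underscored = simulate_build([
    ("src/foo.c", "foo.exe", []),
    ("src/foo.c", "foo_patched.exe", ["-DPATCHED=1"]),
])
print(dotted)
print(underscored)
```

In this model, renaming the output from foo.patched to foo_patched makes no difference: both builds end up with a single src/.foo.c.bc entry holding whatever the last compilation wrote.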
Yeah you are right.