gllvm
Capturing the command-line arguments used on each translation unit
Hi there,
Does gllvm support capturing the command-line arguments (not underlying driver arguments) used on each translation unit?
For example, if I had the following runs:
gclang -o foo.o -flag1 -flag2 -flag3 foo.c
gclang -o bar.o -flag1 -flag2 -flag4 bar.c
gclang -o bar -lwhatever foo.o bar.o
I'd like the following mapping stored in a section somewhere in bar:
foo.o = clang -o foo.o -flag1 -flag2 -flag3 foo.c
bar.o = clang -o bar.o -flag1 -flag2 -flag4 bar.c
bar = clang -o bar -lwhatever foo.o bar.o
I'm aware that I can approximate this at the clang/LLVM level with -grecord-gcc-switches or -frecord-command-line, but was curious if I could do the same at the gllvm level.
This is something I could try to contribute, if there's interest.
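To make the requested mapping concrete, here is a minimal sketch of how a wrapper could derive it from the three example invocations. It is not gllvm code: `output_of` and `record` are hypothetical helpers, the `-o` parsing is deliberately simplistic, and embedding the result in a section of the binary is left out.

```python
import shlex


def output_of(argv):
    """Return the -o target of a compiler invocation, or None if absent.

    Simplification: only '-o FILE' and '-oFILE' are recognised.
    """
    for i, arg in enumerate(argv):
        if arg == "-o" and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("-o") and len(arg) > 2:
            return arg[2:]
    return None


def record(mapping, argv, real_tool="clang"):
    """Store 'output -> command line', replacing the wrapper name (argv[0])
    with the underlying tool name."""
    out = output_of(argv)
    if out is not None:
        mapping[out] = shlex.join([real_tool] + list(argv[1:]))
    return mapping


mapping = {}
for line in [
    "gclang -o foo.o -flag1 -flag2 -flag3 foo.c",
    "gclang -o bar.o -flag1 -flag2 -flag4 bar.c",
    "gclang -o bar -lwhatever foo.o bar.o",
]:
    record(mapping, shlex.split(line))

for out, cmd in mapping.items():
    print(f"{out} = {cmd}")
```

Run on the three example commands, this prints the same three `output = command` lines listed in the post.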
So you want to create another section that contains the commands used to generate the compilation unit?
I guess that is possible, but isn't in the code. It shouldn't be too hard, after all that is how the bitcode is "recorded".
There has been occasional mumbling about needing something like this.
> So you want to create another section that contains the commands used to generate the compilation unit?
Yep, exactly. And yeah, I figure I could reuse the current section techniques/code to stash it.
I'll look into it a bit.
And I guess an additional switch to get-bc that dumps the information out to a file, like the manifest switch does.
Sounds reasonable. I remember @HassenSaidi complained that we lost the necessary information to relink the bitcode.
@ianamason @woodruffw : I did run into this issue in the past. I had more complex scenarios involving changes to the .o files between their creation and the linking. Imagine for instance changing the name of a symbol between generating the .o file and linking it. So to do this properly, the section containing the commands should be generated by tracking all file changes during the build process.
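One way to at least detect the scenario described above is to record a content hash of each object file when it is produced and compare it again at link time. The sketch below is an assumption about how that could look, not something gllvm or blight is known to do; `digest` and `stale_inputs` are hypothetical helpers, and the file contents are placeholders.

```python
import hashlib
import os
import tempfile


def digest(path):
    """SHA-256 of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def stale_inputs(recorded, inputs):
    """Return the link inputs whose contents changed since they were recorded."""
    return [p for p in inputs if p in recorded and digest(p) != recorded[p]]


with tempfile.TemporaryDirectory() as tmp:
    obj = os.path.join(tmp, "foo.o")

    # Compile step: the object file is written and its digest recorded.
    with open(obj, "wb") as f:
        f.write(b"original object bytes")
    recorded = {obj: digest(obj)}

    # Something rewrites the object before linking
    # (a stand-in for, e.g., renaming a symbol).
    with open(obj, "wb") as f:
        f.write(b"rewritten object bytes")

    # Link step: flag inputs that no longer match the recorded command.
    changed = [os.path.basename(p) for p in stale_inputs(recorded, [obj])]
    print(changed)
```

This only detects that a file changed; it does not say which tool changed it. Capturing that would need the fuller build-wide tracking the comment asks for.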
So you gave up on this and created that: https://github.com/trailofbits/blight. I'll leave this here, so others can follow.
@woodruffw I took a look at blight, but I haven't tried it out. Is there any other black magic other than creating a directory containing your wrappers and sticking that directory at the front of the PATH? What happens when build systems do bad things like call hard coded paths to tools? Like /usr/bat/shit/crazy/clang?
@ianamason we use two techniques:
- Most build systems respect CC, CXX, etc., so we simply point those to blight-cc, blight-c++, etc.
- If that doesn't work (e.g. if a build hardcodes clang++ instead of using $(CC)), we do the $PATH trick you mentioned. In that case, /tmp/.../clang++ becomes a shim around blight-c++.
That leaves the worst case, i.e. a fully qualified path like /usr/bat/shit/crazy/clang. We don't handle those at all at the moment, since we (experimentally) haven't run into too many real world builds that actually do that. However, we could in theory handle those by tracing the child process's exec* family calls and looking for things that look like build tools. I believe that's what tools like bear do.
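The first two techniques above can be sketched roughly as follows. This is an illustration under assumptions, not how blight itself is implemented: `make_shim_dir` and `build_env` are hypothetical helpers, the shim is a one-line POSIX shell script, and only the names blight-cc and blight-c++ come from the discussion. A build that calls a fully qualified compiler path would bypass both.

```python
import os
import shutil
import stat
import tempfile


def make_shim_dir(shims):
    """Create a directory of executable shims.

    `shims` maps a tool name (e.g. 'clang++') to the wrapper it should exec.
    """
    shim_dir = tempfile.mkdtemp(prefix="shims-")
    for tool, wrapper in shims.items():
        path = os.path.join(shim_dir, tool)
        with open(path, "w") as f:
            f.write(f'#!/bin/sh\nexec {wrapper} "$@"\n')
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return shim_dir


def build_env(shim_dir, base=None):
    """Environment for the build: CC/CXX point at the wrappers, and the shim
    directory goes first on PATH to catch hardcoded tool names."""
    env = dict(os.environ if base is None else base)
    env["CC"] = "blight-cc"
    env["CXX"] = "blight-c++"
    env["PATH"] = shim_dir + os.pathsep + env.get("PATH", "")
    return env


shim_dir = make_shim_dir({"clang++": "blight-c++"})
env = build_env(shim_dir)

# A bare 'clang++' now resolves to the shim rather than the real compiler.
resolved = shutil.which("clang++", path=env["PATH"])
expected = os.path.join(shim_dir, "clang++")
print(resolved == expected)

shutil.rmtree(shim_dir)  # clean up the temporary shim directory
```

The resulting `env` would be passed to the build, for example via `subprocess.run(["make"], env=env)`.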
Thanks! I thought I saw a discussion that cmake doesn't respect AR, is that right?
That sounds right, although I'm not 100% sure -- I know they have their own CMAKE_AR variable instead, but I'm not sure if that's the sole variable or whether it just takes precedence.