
compile_commands.json support

Open nickdesaulniers opened this issue 5 years ago • 5 comments

Some tools, like Bear, can generate a compile_commands.json file, which is a standard format that many other tools accept. (I thought scan-build supported this as input, but I can't seem to find the documentation at the moment.)
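For context, compile_commands.json is just a JSON array with one entry per translation unit, recording the working directory, the exact compile command, and the source file. The entry below is a made-up illustration, parsed with Python's standard json module:

```python
import json

# A minimal compilation database: one entry per translation unit.
# The paths and flags here are hypothetical.
sample = """
[
  {
    "directory": "/home/user/project/build",
    "command": "cc -Iinclude -O2 -c ../src/main.c -o main.o",
    "file": "../src/main.c"
  }
]
"""

entries = json.loads(sample)
for entry in entries:
    # Print each source file and the compiler driver used for it.
    print(entry["file"], "->", entry["command"].split()[0])
```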

It would be nice to feed this as input to ikos.

nickdesaulniers avatar Dec 12 '18 05:12 nickdesaulniers

I believe that scan-build doesn't support a compilation database (it predates the compilation database format and nobody has added support for it), but clang-tidy and run-clang-tidy.py (which can both run Clang Static Analyzer checks) do. Ericsson CodeChecker, a web frontend for the Clang static analyzers, also accepts a compilation database. Compilation databases are also widely used by editors like VS Code to understand how source files are compiled for IntelliSense, and by formatters like clang-format just to know which source files are part of the build.

AbigailBuccaneer avatar Dec 12 '18 09:12 AbigailBuccaneer

Hi @nickdesaulniers and @AbigailBuccaneer, thanks for suggesting it.

It would be nice to have this, yes. I'm trying to think about the best way to implement this. We will first need to rewrite the commands (using either clang -c -emit-llvm or llvm-link) and add a few flags. Then, how do we compile everything? Should we just execute each command sequentially? This might be slow for big projects. If we want to run multiple jobs in parallel, then it seems like we are just writing another make. And we don't have the dependencies between commands, so we would need to infer them?
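The command-rewriting step mentioned above might look roughly like this (a simplified sketch: it swaps the driver for clang, adds -emit-llvm, and renames the output; real flag handling would need to filter clang-incompatible options too):

```python
import shlex

def rewrite_for_bitcode(command: str) -> str:
    """Turn a plain `cc -c foo.c -o foo.o` command into one that emits
    LLVM bitcode instead of a native object file (simplified sketch)."""
    args = shlex.split(command)
    out = []
    skip_next = False
    for i, arg in enumerate(args):
        if skip_next:
            skip_next = False
            # Rewrite the output file name from .o to .bc.
            out.append(arg[:-2] + ".bc" if arg.endswith(".o") else arg)
            continue
        if i == 0:
            out.append("clang")   # force clang as the driver
        elif arg == "-o":
            out.append(arg)
            skip_next = True      # the next argument is the output name
        else:
            out.append(arg)
    out.insert(1, "-emit-llvm")
    return " ".join(out)

print(rewrite_for_bitcode("cc -O2 -c main.c -o main.o"))
# prints "clang -emit-llvm -O2 -c main.c -o main.bc"
```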

arthaud avatar Dec 13 '18 00:12 arthaud

Thanks for the thoughtful response @arthaud!

I think a good first step would be compile_commands.json support for projects that have already been compiled with the magic flags that ikos needs. Then the compilation database would tell ikos what source files there are, the flags that were used, and where their corresponding bitcode files are. Many projects already generate a compilation database and can have their CC and CXX variables overridden to add the necessary flags. Then ikos-scan no longer has to be responsible for any wrapping and intercepting, as build systems can generate the compilation database or users can use something like rizsotto/Bear.
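Overriding CC/CXX while capturing the database might look like this (a sketch: the exact flags ikos needs are an assumption here, and the bear invocation is shown commented out since it must run inside the project):

```shell
# Point CC/CXX at clang with the extra flags the analyzer needs,
# then wrap the build with Bear to capture compile_commands.json.
export CC="clang -g -emit-llvm"
export CXX="clang++ -g -emit-llvm"
# bear -- make   # run inside the project; writes compile_commands.json
echo "$CC"
```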

Then the next part of this proposal could come later: using the compile flags from compile_commands.json to get the compiler options that each source file was compiled with, to be able to compile from source fully correctly. This would then mean that ikos isn't responsible for wrapping the build system, and you could generate the compilation database once and then run static analyzer checks without having to run the original build system.

It probably would be beneficial to compile all the source (that isn't already compiled on disk, where the compilation database says it will be) and analyze all of the bitcode files in parallel. I don't think you have to worry about dependencies. Actually, now I think about it, a compilation database usually doesn't tell you how things are linked (it just lists cc -c commands with no cc a.o b.o ... commands), so just a compilation database on its own still might need extra options specifying to do analysis across multiple bitcode files.

Bearing in mind that there's no need for dependencies for all that compilation, if you still didn't want to write any parallel scheduling code you could write out a Makefile, or a build.ninja, that built all the bitcode files. I'm not sure how worth it that would be, it kind of defeats the aim a little bit.
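Emitting such a Makefile is straightforward precisely because the bitcode files are independent, so `make -jN` handles all the parallel scheduling. A sketch (the rule bodies and the bare `clang -emit-llvm` rewrite are illustrative):

```python
import json

def database_to_makefile(db_json: str) -> str:
    """Emit a Makefile with one independent rule per translation unit,
    so `make -jN` can build all the bitcode files in parallel (sketch)."""
    entries = json.loads(db_json)
    targets = []
    rules = []
    for entry in entries:
        src = entry["file"]
        bc = src.rsplit(".", 1)[0] + ".bc"
        targets.append(bc)
        # Illustrative recipe: a real tool would reuse the recorded
        # command's include paths and defines, filtered for clang.
        rules.append(f"{bc}: {src}\n\tclang -emit-llvm -c {src} -o {bc}\n")
    return "all: " + " ".join(targets) + "\n\n" + "\n".join(rules)

db = '[{"directory": "/proj", "command": "cc -c a.c -o a.o", "file": "a.c"}]'
print(database_to_makefile(db))
```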

AbigailBuccaneer avatar Dec 13 '18 20:12 AbigailBuccaneer

Hi @AbigailBuccaneer,

> I think a good first step would be compile_commands.json support for projects that have already been compiled with the magic flags that ikos needs. Then the compilation database would tell ikos what source files there are, the flags that were used, and where their corresponding bitcode files are. Many projects already generate a compilation database and can have their CC and CXX variables overridden to add the necessary flags. Then ikos-scan no longer has to be responsible for any wrapping and intercepting, as build systems can generate the compilation database or users can use something like rizsotto/Bear.

In this use case, what is the purpose of compile_commands.json then? If everything is already compiled to llvm bitcode using ikos-scan or by manually adding the required flags, then it is ready to be analyzed with ikos. Are you suggesting to use compile_commands.json to automatically find the binaries to analyze?

> Then the next part of this proposal could come later: using the compile flags from compile_commands.json to get the compiler options that each source file was compiled with, to be able to compile from source fully correctly. This would then mean that ikos isn't responsible for wrapping the build system, and you could generate the compilation database once and then run static analyzer checks without having to run the original build system.

> It probably would be beneficial to compile all the source (that isn't already compiled on disk, where the compilation database says it will be) and analyze all of the bitcode files in parallel. I don't think you have to worry about dependencies. Actually, now I think about it, a compilation database usually doesn't tell you how things are linked (it just lists cc -c commands with no cc a.o b.o ... commands), so just a compilation database on its own still might need extra options specifying to do analysis across multiple bitcode files.

> Bearing in mind that there's no need for dependencies for all that compilation, if you still didn't want to write any parallel scheduling code you could write out a Makefile, or a build.ninja, that built all the bitcode files. I'm not sure how worth it that would be, it kind of defeats the aim a little bit.

I would need the dependencies if I wanted to compile in parallel, and compile_commands.json does not provide them. As you said, even if I had the dependencies, I would just be writing another make or ninja. I'm not sure it's really worth it.

arthaud avatar Dec 15 '18 02:12 arthaud

I do see value in using compile_commands.json as input to ikos instead of relying on ikos-scan. ikos-scan assumes you have a working build environment. What if you don't, which happens most of the time on projects that require licensed compilers?

There is a Python module called compiledb that lets you generate a compile_commands.json without compiling the sources. I use it to get the file, and then I wrote a little Python script that uses compile_commands.json to emit LLVM bitcode files with clang and link them together into a final bitcode file with llvm-link, just like ikos-scan would do, except now I can skip over whatever native compiler options caused errors. I then feed this final bitcode file to ikos for static analysis. I don't need a working build environment, so in a way it's easier to use compile_commands.json to start your static analysis.

You can also modify the compile_commands.json file to skip over incompatible flags without touching the project's makefiles. I ran into a lot of flags that clang doesn't accept when I used ikos-scan, which is why this idea came about. It also doesn't make sense to keep updating ikos-scan to handle every new incompatible compiler flag.
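The workflow described above can be sketched as a script that derives the full command sequence from the database (a sketch with made-up file names; the clang flags and the final `ikos` invocation are illustrative, and flag filtering is left out):

```python
import json

def bitcode_commands(db_json: str, output: str = "whole_program.bc"):
    """From a compilation database, produce the clang / llvm-link / ikos
    command sequence of the workflow described above (sketch only:
    incompatible-flag filtering and include paths are omitted)."""
    entries = json.loads(db_json)
    cmds, bitcode_files = [], []
    for entry in entries:
        src = entry["file"]
        bc = src.rsplit(".", 1)[0] + ".bc"
        bitcode_files.append(bc)
        # Compile each translation unit to LLVM bitcode.
        cmds.append(f"clang -c -emit-llvm -g {src} -o {bc}")
    # Link all bitcode files into a single module, then analyze it.
    cmds.append("llvm-link " + " ".join(bitcode_files) + f" -o {output}")
    cmds.append(f"ikos {output}")
    return cmds

db = ('[{"file": "a.c", "command": "cc -c a.c", "directory": "/p"},'
      ' {"file": "b.c", "command": "cc -c b.c", "directory": "/p"}]')
for cmd in bitcode_commands(db):
    print(cmd)
```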

vietcali avatar Aug 07 '19 19:08 vietcali