codeql-cli-binaries

Some files missing after database creation

Open mvanotti opened this issue 5 years ago • 12 comments

While trying to create a database for the fuchsia operating system, it seems to leave out some of the files that were built.

Note that the steps to build fuchsia and a codeql database require ~200GB of disk space. I just want to get a better understanding of why it might be skipping some of the files.


Steps to reproduce:

  1. Follow the steps on the fuchsia site to download and build fuchsia
  2. Inside the fuchsia directory, run: fx set workstation.x64 --with-base //bundles:kitchen_sink --no-goma
  3. Build the database: codeql database create fuchsia-ql --language=cpp --source-root=${FUCHSIA_DIR} --command="./scripts/fx clean-build"
  4. List the files with a script similar to this one
  5. List all the fuchsia .cc files, ignoring the third_party files: find ${FUCHSIA_DIR} -iname "*.cc" | grep -v "third_party"
  6. Compare the files in the database with the .cc files. Some of the .cc files are missing from the database.

This is the list that I have on my system. In particular, files like /zircon/tools/zbi/zbi.cc should always be built.

To get a list of all the compiled files in fuchsia, run: fx compdb and then look at out/default/compile_commands.json and out/default.zircon/compile_commands.json
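A minimal sketch of pulling the compiled source files out of those compile_commands.json files, assuming they follow the usual Clang compilation database layout (a JSON list of objects with "directory" and "file" keys). The function names and the example entries are hypothetical and are not taken from the fuchsia tree or from CodeQL.

```python
import json
import os


def load_entries(compdb_path):
    """Load the list of entries from one compile_commands.json file."""
    with open(compdb_path) as f:
        return json.load(f)


def compiled_sources(entries):
    """Return the set of normalized absolute source paths in the entries."""
    sources = set()
    for entry in entries:
        # "file" may be relative to "directory"; os.path.join keeps an
        # already-absolute "file" unchanged.
        path = os.path.join(entry["directory"], entry["file"])
        sources.add(os.path.normpath(path))
    return sources


# Hypothetical entries, shaped like compilation-database records.
example_entries = [
    {"directory": "/fuchsia/out/default.zircon",
     "file": "../../zircon/tools/zbi/zbi.cc"},
    {"directory": "/fuchsia/out/default",
     "file": "/fuchsia/zircon/tools/zbi/zbi.cc"},
]
print(sorted(compiled_sources(example_entries)))
```

On a POSIX system the relative and absolute spellings above collapse to a single path, which is the effect of the sort | uniq step. For the real build, the entries would come from calling load_entries on out/default/compile_commands.json and out/default.zircon/compile_commands.json and merging the two sets.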

mvanotti avatar Jan 25 '20 01:01 mvanotti

We crash while extracting that file so it doesn't end up in the database. We can reproduce it and we are working on a fix.

alexet avatar Jan 27 '20 13:01 alexet

Can you please give the output of codeql --version? There are several different issues going on here and I want to check that I am seeing the same issues you are.

matt-gretton-dann avatar Jan 28 '20 12:01 matt-gretton-dann

Hi Matt,

$ ./codeql/codeql --version
CodeQL command-line toolchain.
Version: 2.0.1.
Copyright (C) 2019 GitHub, Inc.

I'm using the latest released version, downloaded from this GitHub repo.

mvanotti avatar Jan 29 '20 19:01 mvanotti

Thank you for that info. I have identified several different issues in our code base which are being triggered by the fuchsia build. I am working on fixes, but unfortunately they are unlikely to land in the next release of CodeQL. I will keep you updated on how the work is going through this ticket.

matt-gretton-dann avatar Jan 31 '20 11:01 matt-gretton-dann

Cool! Let me know if there's anything I can do to help.

mvanotti avatar Feb 02 '20 03:02 mvanotti

We have fixes internally now for a couple of the main issues (lld not recognised as a linker, and the vast majority of extraction failures). These should make it into the next release of the CodeQL CLI Binaries (v2.0.3). I do not have a date for that release yet.

matt-gretton-dann avatar Feb 11 '20 17:02 matt-gretton-dann

(@matt-gretton-dann actually 2.0.3 will probably be a point release after 2.0.2 to fix just a single brown-paper-bag bug, and will probably be out tomorrow. The release that contains your fixes will then be 2.0.4.)

hmakholm avatar Feb 11 '20 18:02 hmakholm

Do you know if the change made it into a release? We are at 2.1.0.

mvanotti avatar Apr 16 '20 00:04 mvanotti

/zircon/tools/zbi/zbi.cc is now being included in my database. Will continue testing to see if there's something else missing.

mvanotti avatar Apr 16 '20 19:04 mvanotti

I ran the following query to list all the files:

import cpp

from File f
select f.getAbsolutePath(), ""

I see 8046 different files.

I also gathered all the files from compile_commands.json used during compilation in fuchsia (both in out/default and out/default.zircon). Doing a sort | uniq, I get 9320 different cc files.

This means that there are ~1.3k files that are missing. This gist contains the list of files in the db and in compile_commands.json.
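The comparison can be sketched as a set difference, which names the missing files rather than only counting them. Subtracting the totals (9320 - 8046) gives the ~1.3k estimate, but that figure is a lower bound on the number of compiled files absent from the database, because any database entry that is not one of the compiled files still counts towards the 8046. The function name and the paths below are hypothetical, and both inputs are assumed to use the same absolute path form.

```python
def missing_from_database(compiled, database_files):
    """Return compiled source paths that do not appear in the database list."""
    return sorted(set(compiled) - set(database_files))


# Hypothetical data: three compiled sources, two files listed by the query.
compiled = {"/fuchsia/a.cc", "/fuchsia/b.cc", "/fuchsia/c.cc"}
database_files = {"/fuchsia/a.cc", "/fuchsia/a.h"}

missing = missing_from_database(compiled, database_files)
print(len(missing), missing)
```

In this toy input, subtracting the totals suggests one missing file, while the set difference finds two, because a.h is in the database list but is not a compiled source.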

Some examples of missing files (by eye) are some of the autogenerated fidl files, some unit tests, and some random files that I have no idea why are missing.

mvanotti avatar Apr 16 '20 20:04 mvanotti

Hi! I re-ran this test with CodeQL CLI version 2.2.4 and found similar results.

Looking only at c and cc files, I see that 1140 files are missing from the database. About 700 of them are test files, but they do appear in compile_commands.json.
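A rough sketch of how such a breakdown could be produced from a list of missing paths. The rule for what counts as a test file (the substring "test" anywhere in the path) is an assumption made for illustration and is not necessarily how the numbers above were obtained; the paths are hypothetical.

```python
import os


def is_c_or_cc(path):
    """True for files with a .c or .cc extension."""
    return os.path.splitext(path)[1] in (".c", ".cc")


def looks_like_test(path):
    # Assumption: a crude heuristic, not a fuchsia or CodeQL convention.
    return "test" in path.lower()


def summarize(missing):
    """Return (number of missing .c/.cc files, how many look like tests)."""
    sources = [p for p in missing if is_c_or_cc(p)]
    tests = [p for p in sources if looks_like_test(p)]
    return len(sources), len(tests)


# Hypothetical missing paths.
example_missing = [
    "/fuchsia/src/lib/foo/foo_unittest.cc",
    "/fuchsia/src/lib/foo/tests/bar.cc",
    "/fuchsia/src/lib/foo/baz.c",
    "/fuchsia/src/lib/foo/baz.h",
]
total, tests = summarize(example_missing)
print(f"{total} missing sources, {tests} look like tests")
```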

mvanotti avatar Aug 17 '20 23:08 mvanotti

@mvanotti Are you still running into this?

Manouchehri avatar Feb 07 '22 20:02 Manouchehri