CodeQL database create fails when building Mozilla

Open mies47 opened this issue 1 year ago • 7 comments

I'm trying to build Mozilla from this repo and create a C++ CodeQL database.

To do that, I first run ./mach configure, and once it has finished I run the following command to create the database:

~/codeql/codeql database create mozilla4 --language=c-cpp --command="./mach build"

This builds Mozilla successfully, but database creation then fails while importing the TRAP files. I have attached the logs from creating the database and importing the dataset.

Here is a sample of the errors:

[ERROR] 14135855_0.trap.br, 1: java.io.IOException: Brotli stream decoding failed
                              org.brotli.dec.BrotliInputStream.read(BrotliInputStream.java:167)
                              com.semmle.inmemory.trap.TrapInputStream.read(TrapInputStream.java:60)
                              com.semmle.inmemory.trap.TrapScanner.fill(TrapScanner.java:451)
                              com.semmle.inmemory.trap.TrapScanner.ensureNext(TrapScanner.java:428)
                              com.semmle.inmemory.trap.TrapScanner.nextToken(TrapScanner.java:61)
                              com.semmle.inmemory.trap.TRAPReader.scanTuplesAndLabels(TRAPReader.java:493)
                              com.semmle.inmemory.trap.TRAPLinker$TrapLinkDirectiveScanner.scanTuplesAndLabels(TRAPLinker.java:311)
                              com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:414)
                              com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:400)
                              com.semmle.inmemory.trap.TRAPLinker.lambda$getTasks$2(TRAPLinker.java:215)
                              com.semmle.util.concurrent.FutureUtils.lambda$mapAsync_$8(FutureUtils.java:161)
                              java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
                              java.base/java.lang.Thread.run(Unknown Source)

                               ... caused by:

                              org.brotli.dec.BrotliRuntimeException: Corrupted Huffman code histogram
                              org.brotli.dec.Decode.readComplexHuffmanCode(Decode.java:591)
                              org.brotli.dec.Decode.readHuffmanCode(Decode.java:613)
                              org.brotli.dec.Decode.readMetablockPartition(Decode.java:783)
                              org.brotli.dec.Decode.readMetablockHuffmanCodesAndContextMaps(Decode.java:825)
                              org.brotli.dec.Decode.decompress(Decode.java:1110)
                              org.brotli.dec.BrotliInputStream.read(BrotliInputStream.java:162)
                              com.semmle.inmemory.trap.TrapInputStream.read(TrapInputStream.java:60)
                              com.semmle.inmemory.trap.TrapScanner.fill(TrapScanner.java:451)
                              com.semmle.inmemory.trap.TrapScanner.ensureNext(TrapScanner.java:428)
                              com.semmle.inmemory.trap.TrapScanner.nextToken(TrapScanner.java:61)
                              com.semmle.inmemory.trap.TRAPReader.scanTuplesAndLabels(TRAPReader.java:493)
                              com.semmle.inmemory.trap.TRAPLinker$TrapLinkDirectiveScanner.scanTuplesAndLabels(TRAPLinker.java:311)
                              com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:414)
                              com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:400)
                              com.semmle.inmemory.trap.TRAPLinker.lambda$getTasks$2(TRAPLinker.java:215)
                              com.semmle.util.concurrent.FutureUtils.lambda$mapAsync_$8(FutureUtils.java:161)
                              java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
                              java.base/java.lang.Thread.run(Unknown Source)
                              at (start of line)

I can build Mozilla on its own following similar steps, but creating a database fails. A database is produced in the end, but some of the source files are missing from it.

Can someone please point me in the right direction on how to solve this issue?

mies47 avatar Mar 21 '24 04:03 mies47

Could you zip up and upload the database directory? It will contain the logs of what is probably a crash midway through writing a compressed trap file.

smowton avatar Mar 21 '24 10:03 smowton

Of course. Here's the zip of the database directory. Thanks for your help.

mies47 avatar Mar 21 '24 12:03 mies47

It seems that there are indeed around 360 files on which we crash, as @smowton already suspected. These seem to be internal errors in the C/C++ frontend that we use. You might want to retry with CodeQL 2.16.4 or later, as that includes some C/C++ frontend improvements that may solve your issues. Note that this affects less than 10% of the source files being compiled, so you should still have a fairly complete database.

jketema avatar Mar 21 '24 15:03 jketema
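
Before retrying, it may help to confirm that the installed CLI is at least the suggested version. The following is only a sketch: `version_at_least` is a hypothetical helper, not part of CodeQL, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeeds (exit 0) if version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_at_least() {
  # Sort the two versions in ascending order; if the smallest one is $2,
  # then $1 is greater than or equal to $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use, assuming `codeql version --format=terse` prints only the version number: `version_at_least "$(~/codeql/codeql version --format=terse)" 2.16.4 || echo "please upgrade CodeQL"`.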

Thanks for your suggestion. I tried again with CodeQL 2.16.5. I think it worked better, but the issue persists. I agree that it compiles most of the code base, but I'm trying to run a query on the IPC-related files, and those are the files that are missing. To confirm my guess that the IPC-related files are not being fully imported, I ran a query to find files whose names match "%Child.cpp" or "%Parent.cpp". These are the results:

"PFetchParent.cpp"
"PBackgroundIDBCursorChild.cpp"
"PBackgroundIDBCursorParent.cpp"
"PBackgroundIDBDatabaseChild.cpp"
"PBackgroundIDBDatabaseFileChild.cpp"
"PBackgroundIDBDatabaseFileParent.cpp"
"PBackgroundIDBDatabaseParent.cpp"
"SessionStoreChild.cpp"
"SessionStoreParent.cpp"
"PSessionStoreChild.cpp"
"PSessionStoreParent.cpp"
"GPUParent.cpp"
"VRLayerParent.cpp"
"ContentChild.cpp"
"RemoteDecoderManagerChild.cpp"
"ActorsParent.cpp"
"HeapSnapshotTempFileHelperParent.cpp"
"TestShellChild.cpp"
"VsyncParent.cpp"
"VsyncMainChild.cpp"
"RemoteDecoderManagerParent.cpp"
"VsyncWorkerChild.cpp"
"RDDParent.cpp"
"RemoteDecoderChild.cpp"
"TestShellParent.cpp"
"ProxyAutoConfigParent.cpp"
"ProxyAutoConfigChild.cpp"
"RemoteDecoderParent.cpp"
"RDDChild.cpp"

That is far fewer files than there should be. Do you think there's a way I could fix this? I appreciate your help.

mies47 avatar Mar 22 '24 18:03 mies47
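
For comparison, the same name patterns can be listed directly from disk to see how many files the database ought to contain. This is a minimal sketch: `list_ipc_sources` is a hypothetical helper, and it is an assumption that the generated IPC sources live somewhere under the directory you pass in (for example the build's object directory).

```shell
# Hypothetical helper: print the unique basenames of all *Child.cpp and
# *Parent.cpp files found under the directory given as $1.
list_ipc_sources() {
  find "$1" -type f \( -name '*Child.cpp' -o -name '*Parent.cpp' \) \
    -exec basename {} \; | sort -u
}
```

Running `list_ipc_sources . | wc -l` from the source checkout gives a count to set against the 29 names returned by the query above.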

Could you share the build-tracer.log file from the 2.16.5 run, which should be located somewhere in the database directory? Thanks.

jketema avatar Mar 25 '24 07:03 jketema

Of course, here's the log file.

mies47 avatar Mar 25 '24 14:03 mies47

Thanks. It seems that the tooling does indeed still crash on the files in question. There are just under 400 "CodeQL C++ extractor: Backtrace:" lines in build-tracer.log. The way to solve this is to construct a small test case that reproduces the crash and does not depend on building Mozilla; based on such a test case, we can likely produce a fix. However, given that most of the code is there, creating such a test case will have low priority on our side. If you're able and willing to create such a test case, then we might be able to do something; otherwise, all we will do for now is track this problem internally.

jketema avatar Mar 25 '24 14:03 jketema
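
The backtrace count mentioned above can be reproduced with a fixed-string search of the log. This is a rough sketch: `count_extractor_backtraces` is a hypothetical helper, and the path to build-tracer.log inside the database directory has to be supplied by you.

```shell
# Hypothetical helper: count the lines in the log file $1 that contain the
# extractor backtrace marker quoted in the comment above.
count_extractor_backtraces() {
  # -c prints the number of matching lines; -F matches the text literally.
  grep -cF 'CodeQL C++ extractor: Backtrace:' "$1"
}
```

Call it as `count_extractor_backtraces path/to/build-tracer.log`. Comparing the number before and after a CodeQL upgrade shows whether the newer frontend crashes on fewer files.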