
Advice in out-of-memory message can be misleading on Windows

Open philipp-naused opened this issue 1 year ago • 5 comments

When running the database analyze command, CodeQL seems to check whether enough memory is available. For example, if I run this command on a machine that has less than 16 GB of memory:

    codeql database analyze --ram=16000

I get the following error message:

    CodeQL is out of memory. Try increasing the memory available to CodeQL using the --ram option.

This advice is misleading: to avoid the error I have to reduce the value of the --ram parameter, not increase it.

For reference: I'm using codeql-bundle-v2.16.0 on Windows Server 2022, hosted on Hyper-V with up to 16 GB of dynamic memory and 4 vCPUs. The analyzed database contains about 4.6 million lines of C# code.

philipp-naused avatar Feb 08 '24 13:02 philipp-naused

@hmakholm, is this possibly related to your recent changes to the RAM calculation?

aeisenberg avatar Feb 08 '24 19:02 aeisenberg

I don't think this is directly related to those recent changes.

The real underlying cause is that we develop on Linux systems. There, if you specify more --ram than the machine has, the OS will happily allocate that much address space to the process, but if we then try to use it (and there's no swap space to fall back on), the kernel forcibly terminates the process with SIGKILL: the dreaded "exit code 137" symptom. So in that situation we never even get to try to produce a nice error message.
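For readers unfamiliar with the "exit code 137" symptom: shells conventionally report a process killed by signal N as exit code 128 + N, and SIGKILL is signal 9 on Linux. A minimal Python sketch of that decoding follows; the helper name `describe_exit_code` is made up for illustration and is not part of CodeQL.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a shell-style exit code.

    By convention, a code above 128 means the process was killed by
    signal number (code - 128). This helper is purely illustrative.
    """
    if code > 128:
        try:
            name = signal.Signals(code - 128).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            return f"exit code {code}: no known signal number {code - 128}"
        return f"exit code {code}: terminated by {name}"
    return f"exit code {code}: normal exit"
```

On Linux, `describe_exit_code(137)` names SIGKILL, which is what the kernel sends when it kills a process for running out of memory.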

I think I recall that on Windows (which @philipp-naused is using here), the OS will instead refuse to provide address space when the JVM attempts to extend the heap, and that would lead to an OutOfMemoryError from Java, which we do get to try to report nicely. But we then assume the OutOfMemoryError occurred because the heap did grow to the size specified by --ram and that still wasn't enough to analyze the database.

I don't think we've previously been aware that this assumption can be wrong on Windows, so thanks for the report. We'll look into making the advice in the message less overconfident, depending on the platform.

hmakholm avatar Feb 08 '24 20:02 hmakholm

(The reason why reducing the --ram setting solves this problem is that knowing the actual heap limit tells the CodeQL evaluator it needs to be more aggressive about removing intermediate results from the Java heap, writing them to disk storage if necessary. If, on the other hand, it thinks it has gigabytes and gigabytes to play with, it will hold on to them on the theory that they could turn out to be useful later in the analysis.)
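The practical consequence is that --ram should stay below what the machine can actually provide. As a rough sketch only, a build script might derive the value from the machine's memory while leaving some headroom for the OS and other processes. The function name `suggest_ram_mb` and the headroom defaults below are assumptions chosen for illustration, not CodeQL guidance.

```python
def suggest_ram_mb(physical_mb: int,
                   headroom_fraction: float = 0.25,
                   min_headroom_mb: int = 1024) -> int:
    """Return a --ram value (in MB) that stays below physical memory.

    The headroom policy (a quarter of memory, but at least 1 GB) is an
    assumption for illustration; tune it for the machine in question.
    """
    headroom_mb = max(min_headroom_mb, int(physical_mb * headroom_fraction))
    ram_mb = physical_mb - headroom_mb
    if ram_mb <= 0:
        raise ValueError("not enough physical memory to leave any headroom")
    return ram_mb
```

With these defaults, a machine with 16384 MB would get a suggestion of 12288 MB. On a VM with dynamic memory, the amount available at startup may be lower than the configured maximum, so the input should reflect what is actually assigned.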

hmakholm avatar Feb 08 '24 21:02 hmakholm

That would explain why I had to set the --max-disk-cache parameter to avoid running out of memory. We've had quite a few problems with CodeQL's memory usage: our agents don't have a lot of RAM and need to scan a database of about 5 million lines of code.

It seems this combination works, with up to 3 retries in case the JVM crashes:

    codeql database analyze CodeQL-db/csharp --format=sarifv2.1.0 --ram=8000 --max-disk-cache=100000 --threads=0 --no-rerun --output=CodeQL-Results/csharp.sarif --sarif-category=csharp
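The retry policy mentioned here could be scripted along the following lines. This is a hedged sketch, not the poster's actual pipeline: the helper `run_with_retries` is hypothetical, and treating every non-zero exit code as retryable is an assumption (the thread only mentions retrying when the JVM crashes).

```python
import subprocess
import sys
from typing import Sequence

# The command line quoted in the thread, split into arguments.
ANALYZE_CMD = [
    "codeql", "database", "analyze", "CodeQL-db/csharp",
    "--format=sarifv2.1.0", "--ram=8000", "--max-disk-cache=100000",
    "--threads=0", "--no-rerun",
    "--output=CodeQL-Results/csharp.sarif", "--sarif-category=csharp",
]


def run_with_retries(cmd: Sequence[str], max_retries: int = 3) -> int:
    """Run cmd, rerunning it after a non-zero exit up to max_retries times.

    Returns 0 on the first success, otherwise the last exit code.
    Assumption: any non-zero exit is worth retrying.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    attempts = max_retries + 1
    returncode = 0
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            return 0
        print(f"attempt {attempt}/{attempts} failed with exit code {returncode}",
              file=sys.stderr)
    return returncode
```

A build step would then call `run_with_retries(ANALYZE_CMD)` and fail the job if the result is non-zero.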

philipp-naused avatar Feb 09 '24 08:02 philipp-naused

@hmakholm I think your theory could be right. In our case, the error occurs immediately when running the command. The other out-of-memory problems we mitigated with --max-disk-cache only happened during the evaluation of the queries.

philipp-naused avatar Feb 09 '24 08:02 philipp-naused

An updated wording of this error message on Windows will be in release 2.17.0 of the CLI.

hmakholm avatar Mar 20 '24 21:03 hmakholm