codeql icon indicating copy to clipboard operation
codeql copied to clipboard

Fixes in cpp/global-use-before-init

Open mrigankpawagi opened this issue 5 months ago • 8 comments
trafficstars

This PR introduces two fixes in the cpp/global-use-before-init query, which currently has two very basic inaccuracies.

Write after read

Consider the following example.

int x;

int main() {
    int woof = x;
    x = 42;
}

The query does not raise an alert on this because main does not satisfy useFunc (which holds if the function never writes to, but reads from, the global variable). The fix to useFunc modifies this condition to include functions which have a read that may not be preceded by a write inside the function.

Write before function call

Consider the following example.

int x;

void foo() {
    int meow = x;
}

int main() {
    x = 0;
    foo();
}

The query raises an alert if in the recursive step of uninitializedBefore, f never writes to v. However, what we really want is that g never writes to v (and if it does, locallyUninitialisedAt handles the case). Further, it is not clear why functionInitialises does not consider initFunc to be one of the ways for a function to initialize a variable, although it is considered by initialises. This is addressed by the fixes in uninitializedBefore and functionInitialises. This also required a corresponding change in locallyUninitialisedAt.

I am aware that some more enhancements can be made, but I am not sure how open the maintainers are to modifications in this query. I would be happy to discuss more in the comments.

mrigankpawagi avatar Jun 05 '25 09:06 mrigankpawagi

Hi @mrigankpawagi,

Thanks for your contribution. We are always open to fixes to our queries. I would like to mention that we generally do not run this query, because it has performance issues due to its use of the points-to library. This also explains why some of the issues you're observing have been present for so long.

As part of this contribution, could you add some relevant tests to cpp/ql/test/query-tests/Critical/GlobalUseBeforeInit? Your code snippets from above seem like a good start.

I see your changes currently do not satisfy our formatting guidelines. The easiest way to fix this is by loading the query file in VScode with the CodeQL extension installed, and asking VScode to format the file.

jketema avatar Jun 05 '25 13:06 jketema

Thanks for your response, @jketema! I have addressed your comments.

mrigankpawagi avatar Jun 05 '25 15:06 mrigankpawagi

I feel that using local dataflow alongside some of the things already present in this query, could solve a lot of problems. However, I am not sure if it is worth the effort in the context of real-world code.

mrigankpawagi avatar Jun 13 '25 14:06 mrigankpawagi

Results of a multi-repository variant analysis run comparing the before and after: https://gist.github.com/jketema/67fbda60f0437951b94a96f4a6cb210b. This should be publicly accessible, but let me know if I'm mistaken. Note that these are not the complete results over a 1000 projects. As the query has some performance problems, there are timeouts after 6 hours.

One new result I'm seeing quite a significant number of times are uses of global variables referenced inside sizeof. These should probably be excluded. Example:

saved_len = mem_calloc(max_kpc, sizeof(*saved_len));

void init() {
  saved_len = mem_calloc(max_kpc, sizeof(*saved_len));
}

I feel that using local dataflow alongside some of the things already present in this query, could solve a lot of problems. However, I am not sure if it is worth the effort in the context of real-world code.

I would indeed let real-world code drive this if you want to make further improvements. This is generally what we do ourselves.

Note that to make the query performant enough to be included in our standard set of queries we would have to rewrite it to not use the points-to library. This does mean that at this point we would like to spend as little effort on the query as possible until we can schedule that work.

jketema avatar Jun 16 '25 20:06 jketema

This is very cool, @jketema! I would also like to run some MRVA to explore some more. However I am facing some issues... It would be very helpful if you could tell me how you ran MRVA on your end.

I have a controller repository and I am able to "Run Variant Analysis" from the VSCode CodeQL extension. I open the query I want to run, press Ctrl+P and select "CodeQL: Run Variant Analysis" (or alternatively, go the CodeQL extension pane, right click on my query in the "Queries" section and select "Run against variant analysis repositories"). This starts the CI action on my repository, but it fails with Error: Path '/home/runner/work/_temp/731388fa-0b34-4d14-8c5f-23c84b67ef51' is missing a qlpack file. even though I have a qlpack.yml file which lists codeql/cpp-all as a dependency. On looking at the CodeQL Extension log, I see that it ignored the qlpack file (and also all dependencies, and even my actually query file!) "since it falls outside of the pack root".

Sorry for the tangential question -- unfortunately there is very limited documentation on this. Any help will be highly appreciated.

mrigankpawagi avatar Jun 19 '25 18:06 mrigankpawagi

Sorry for the tangential question -- unfortunately there is very limited documentation on this. Any help will be highly appreciated.

That's not an error I've seen before. The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query: https://github.com/github/vscode-codeql-starter/blob/main/codeql-custom-queries-cpp/example.ql on the Top 10 repositories.

One word of warning if you do get this to work, if the controller repository is under your own account, you might run out of a free action minutes, especially when you run on 1000 repositories.

For the current PR, I'd recommend solving the sizeof issue I mentioned above, then we can hopefully get this merged.

jketema avatar Jun 20 '25 15:06 jketema

The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query

Thanks. I tried this but faced the same issue.

you might run out of a free action minutes, especially when you run on 1000 repositories.

Thanks for the heads up. I will take care of this.

For the current PR, I'd recommend solving the sizeof issue I mentioned above, then we can hopefully get this merged.

I will take a stab at this.

mrigankpawagi avatar Jun 21 '25 18:06 mrigankpawagi

The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query

Thanks. I tried this but faced the same issue.

I'm not able to reproduce the issue. I'd recommend opening an issue in https://github.com/github/vscode-codeql to report this.

jketema avatar Jun 23 '25 08:06 jketema

Thanks @jketema! How does this look now?

(PS I have created an issue about mrva here.)

mrigankpawagi avatar Jun 28 '25 17:06 mrigankpawagi

Thanks @jketema! How does this look now?

Thanks for the changes. I'm running MRVA on this. I'll post the results when they're available.

jketema avatar Jun 30 '25 15:06 jketema

Thanks a lot!

mrigankpawagi avatar Jun 30 '25 16:06 mrigankpawagi

New MRVA results here: https://gist.github.com/jketema/8c7f59e69f86be2834b53381298532ed

jketema avatar Jul 01 '25 09:07 jketema

Do we need a change note if the query isn't run by default?

Let's add a change note, just to err on the safe side. This needs to go in cpp/ql/src/change-notes. The note needs to satisfy the format described in https://github.com/github/codeql/blob/main/docs/change-notes.md.

Maybe something along the lines of:

---
category: minorAnalysis
---
* Fixed a number of false positives and false negatives in `cpp/global-use-before-init`. Note that this query is not part of any of the default query suites.

jketema avatar Jul 01 '25 09:07 jketema

There is a hidden CI action that seems to be failing. Hope its not an issue. Thanks!

mrigankpawagi avatar Jul 01 '25 13:07 mrigankpawagi

There is a hidden CI action that seems to be failing. Hope its not an issue. Thanks!

Actually, they were all failing. I merged main into your branch, which should resolve this.

jketema avatar Jul 01 '25 14:07 jketema

Thanks for your help on this PR @jketema! It was fun working on this :-)

mrigankpawagi avatar Jul 01 '25 14:07 mrigankpawagi