codeql
codeql copied to clipboard
Fixes in cpp/global-use-before-init
This PR introduces two fixes in the cpp/global-use-before-init query, which currently has two very basic inaccuracies.
Write after read
Consider the following example.
int x;
int main() {
int woof = x;
x = 42;
}
The query does not raise an alert on this because main does not satisfy useFunc (which holds if the function never writes to, but reads from, the global variable). The fix to useFunc modifies this condition to include functions which have a read that may not be preceded by a write inside the function.
Write before function call
Consider the following example.
int x;
void foo() {
int meow = x;
}
int main() {
x = 0;
foo();
}
The query raises an alert if in the recursive step of uninitializedBefore, f never writes to v. However, what we really want is that g never writes to v (and if it does, locallyUninitialisedAt handles the case). Further, it is not clear why functionInitialises does not consider initFunc to be one of the ways for a function to initialize a variable, although it is considered by initialises. This is addressed by the fixes in uninitializedBefore and functionInitialises. This also required a corresponding change in locallyUninitialisedAt.
I am aware that some more enhancements can be made, but I am not sure how open the maintainers are to modifications in this query. I would be happy to discuss more in the comments.
Hi @mrigankpawagi,
Thanks for your contribution. We are always open to fixes to our queries. I would like to mention that we generally do not run this query, because it has performance issues due to its use of the points-to library. This also explains why some of the issues you're observing have been present for so long.
As part of this contribution, could you add some relevant tests to cpp/ql/test/query-tests/Critical/GlobalUseBeforeInit? Your code snippets from above seem like a good start.
I see your changes currently do not satisfy our formatting guidelines. The easiest way to fix this is by loading the query file in VScode with the CodeQL extension installed, and asking VScode to format the file.
Thanks for your response, @jketema! I have addressed your comments.
I feel that using local dataflow alongside some of the things already present in this query, could solve a lot of problems. However, I am not sure if it is worth the effort in the context of real-world code.
Results of a multi-repository variant analysis run comparing the before and after: https://gist.github.com/jketema/67fbda60f0437951b94a96f4a6cb210b. This should be publicly accessible, but let me know if I'm mistaken. Note that these are not the complete results over a 1000 projects. As the query has some performance problems, there are timeouts after 6 hours.
One new result I'm seeing quite a significant number of times are uses of global variables referenced inside sizeof. These should probably be excluded. Example:
saved_len = mem_calloc(max_kpc, sizeof(*saved_len));
void init() {
saved_len = mem_calloc(max_kpc, sizeof(*saved_len));
}
I feel that using local dataflow alongside some of the things already present in this query, could solve a lot of problems. However, I am not sure if it is worth the effort in the context of real-world code.
I would indeed let real-world code drive this if you want to make further improvements. This is generally what we do ourselves.
Note that to make the query performant enough to be included in our standard set of queries we would have to rewrite it to not use the points-to library. This does mean that at this point we would like to spend as little effort on the query as possible until we can schedule that work.
This is very cool, @jketema! I would also like to run some MRVA to explore some more. However I am facing some issues... It would be very helpful if you could tell me how you ran MRVA on your end.
I have a controller repository and I am able to "Run Variant Analysis" from the VSCode CodeQL extension. I open the query I want to run, press Ctrl+P and select "CodeQL: Run Variant Analysis" (or alternatively, go the CodeQL extension pane, right click on my query in the "Queries" section and select "Run against variant analysis repositories"). This starts the CI action on my repository, but it fails with Error: Path '/home/runner/work/_temp/731388fa-0b34-4d14-8c5f-23c84b67ef51' is missing a qlpack file. even though I have a qlpack.yml file which lists codeql/cpp-all as a dependency. On looking at the CodeQL Extension log, I see that it ignored the qlpack file (and also all dependencies, and even my actually query file!) "since it falls outside of the pack root".
Sorry for the tangential question -- unfortunately there is very limited documentation on this. Any help will be highly appreciated.
Sorry for the tangential question -- unfortunately there is very limited documentation on this. Any help will be highly appreciated.
That's not an error I've seen before. The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query: https://github.com/github/vscode-codeql-starter/blob/main/codeql-custom-queries-cpp/example.ql on the Top 10 repositories.
One word of warning if you do get this to work, if the controller repository is under your own account, you might run out of a free action minutes, especially when you run on 1000 repositories.
For the current PR, I'd recommend solving the sizeof issue I mentioned above, then we can hopefully get this merged.
The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query
Thanks. I tried this but faced the same issue.
you might run out of a free action minutes, especially when you run on 1000 repositories.
Thanks for the heads up. I will take care of this.
For the current PR, I'd recommend solving the
sizeofissue I mentioned above, then we can hopefully get this merged.
I will take a stab at this.
The first thing I would try is to setup a CodeQL starter workspace (https://github.com/github/vscode-codeql-starter) and have a look if you can run the example C++ query
Thanks. I tried this but faced the same issue.
I'm not able to reproduce the issue. I'd recommend opening an issue in https://github.com/github/vscode-codeql to report this.
Thanks @jketema! How does this look now?
Thanks for the changes. I'm running MRVA on this. I'll post the results when they're available.
Thanks a lot!
New MRVA results here: https://gist.github.com/jketema/8c7f59e69f86be2834b53381298532ed
Do we need a change note if the query isn't run by default?
Let's add a change note, just to err on the safe side. This needs to go in cpp/ql/src/change-notes. The note needs to satisfy the format described in https://github.com/github/codeql/blob/main/docs/change-notes.md.
Maybe something along the lines of:
---
category: minorAnalysis
---
* Fixed a number of false positives and false negatives in `cpp/global-use-before-init`. Note that this query is not part of any of the default query suites.
There is a hidden CI action that seems to be failing. Hope its not an issue. Thanks!
There is a hidden CI action that seems to be failing. Hope its not an issue. Thanks!
Actually, they were all failing. I merged main into your branch, which should resolve this.
Thanks for your help on this PR @jketema! It was fun working on this :-)