codeql
codeql copied to clipboard
C++: Recognize allocation functions heuristically
This PR does two things:
- It adds a couple of new interface classes,
HeuristicAllocationExprandHeuristicAllocationFunction, that complement the already existingAllocationExprandHeuristicAllocationclasses with functions that we think may perform allocations. - It switched the new
cpp/invalid-pointer-derefquery to useHeuristicAllocationExprinstead ofAllocationExpras a source of flow.
Currently, the heuristics used for detecting an HeuristicAllocationFunction is:
- The function name matches
%alloc% - It returns a pointer type
- We can find a unique parameter of
unsignedintegral type
In addition, a HeuristicAllocationFunction is recognized as a realloc-like function if, in addition to the above, it:
- Matches the name
%realloc%, and - We can find a unique parameter of a pointer type
This finds a bunch of new allocation functions on the databases I've tried. For example, on php we recognize the following set of new allocation functions:
I've manually verified that these are all implementations that we want to recognize as allocation functions.
Commit-by-commit review encouraged.
- https://github.com/github/codeql/commit/a9710453f4366ebb999b6e383e4cbc71ec364d51 adds the new classes by wrapping a parameterized module around the logic for generating an
AllocationExprfrom a call to anAllocationFunction. - https://github.com/github/codeql/commit/d12a76559a44708274446cb4527da862ebaf5dcb uses the new class in
cpp/invalid-pointer-deref. - https://github.com/github/codeql/commit/2a514d60d448b3ef1ef0d7810c66cf158b3973a9 adds a
isBarrierInpredicate to the product dataflow library. This is necessary since many of the heuristic allocators are just simple wrappers around allocation expressions that we already know, and we'd otherwise get duplicated paths from starting flow both inside those functions, and from the return value of the function we now recognize as a heuristic allocation expression.
Finally, I had to add nomagic to a predicate in the product dataflow library because I was observing a bad join due to bad magic.
With this PR, we now recognize https://nvd.nist.gov/vuln/detail/CVE-2016-10160 out of the box 🎉.
I like the idea but I'm not following the parameterized module tricks you're doing.
I like the idea but I'm not following the parameterized module tricks you're doing.
Thanks! I'll add some comments to explain what's going on. The code looks slightly uglier than I wanted to because we cannot yet have type signature member predicates (although it's coming very soon!). So instead of writing the code like:
private signature class CallAllocationExprTarget extends Function {
int getReallocPtrArgSig();
int getSizeArgSig();
int getSizeMultSig();
predicate requiresDeallocSig();
}
private module CallAllocationExprBase<CallAllocationExprTarget Target> {
class CallAllocationExprImpl instanceof FunctionCall {
...
}
}
we have to write it as:
private signature class CallAllocationExprTarget extends Function;
private module CallAllocationExprBase<CallAllocationExprTarget Target> {
signature int getReallocPtrArgSig(Target target);
signature int getSizeArgSig(Target target);
signature int getSizeMultSig(Target target);
signature predicate requiresDeallocSig(Target target);
module With<
getReallocPtrArgSig/1 getReallocPtrArg, getSizeArgSig/1 getSizeArg, getSizeMultSig/1 getSizeMult,
requiresDeallocSig/1 requiresDealloc> {
class CallAllocationExprImpl instanceof FunctionCall {
...
}
}
}
cc @ginsbach on parameterized module usage (you can restrict your 👀 to the first commit only)
cc @ginsbach on parameterized module usage (you can restrict your 👀 to the first commit only)
Always happy to see parameterised module PRs!
Wouldn't it be easier to use a module signature, rather than 4 predicate signature arguments for With<>?
Wouldn't it be easier to use a module signature, rather than 4 predicate signature arguments for With<>?
I ... I need to lookup my parameterized module syntax, then. I very much think you're right!
Wouldn't it be easier to use a module signature, rather than 4 predicate signature arguments for
With<>?
Fixed in https://github.com/github/codeql/pull/10635/commits/483ff58c39ec8704862705bffab7aef63950883f. I suppose this rewrite will also make it a lot easier for @rdmarsh2 to see what's going on.