codeql icon indicating copy to clipboard operation
codeql copied to clipboard

CPP: Nested Conditionals and BarrierGuards

Open bdrodes opened this issue 3 years ago • 2 comments
trafficstars

I'm trying to detect sanitizing/barrier guards in more complex control flow. In another BarrierGarud issue I opened (https://github.com/github/codeql/issues/10011), we established how to address complex dataflow into the barrier guard, but the remaining issue I have is that the underlying mechanics of these guards (whether using barrier guards directly or the mechanic in https://github.com/github/codeql/issues/10011) relies on the 'controls' predicate.

Here is a null dereference example that use of barrier guards or the mechanism discussed in https://github.com/github/codeql/issues/10011 works fine with (the query correctly identifies the bad case, but not the good case):

	       char* d = (char*)my_malloc(10);
               // BARRIER GUARD: CHECKS IF THE VALUE IS NULL
		if(d == NULL)
			goto end;
		// GOOD: NULL-CHECKED DEREF (CORRECTLY NOT IDENTIFIED BY THE QUERY)
		use(d);

		end:
		// BAD: POSSIBLE NULL DEREF (CORRECTSLY IDENTIFIED BY THE QUERY)
		use(d);

This style of code (using gotos, exits, or returns) is common, and unfortunately, it's not always as that simple in terms of the control flow. Often the null check (barrier guard) is buried within other conditionals. Here is an example, and due to the nesting of the barrier guard, both the good and bad case are detected as potential null dereference.

		char* d; 
 		if (boolgen()) {
			d = malloc(10);
			if (d == NULL)
			{
				goto end;
			}
		}

		// GOOD: NULL-CHECKED DEREF (INCORRECTLY IDENTIFIED BY THE QUERY)
		use(d);
		
		end:
		// BAD: POSSIBLE NULL DEREF (CORRECTLY IDENTIFIED BY THE QUERY)
		use(d);

Note that in this case it is possible the value is used without initialization, which is its own bug, but that's not what CodeQL is reporting for me query. It is identifying the malloc initialization as the source.

The issue, as far as I can tell, is that the underlying 'controls' logic correctly says the guard does not control the use (since it is nested and not a dominating guard).

Examples like this yield hundreds of false positives in real-world tests, and it's unclear to me what CodeQL paradigm can be used to filter out these cases. Any suggestions?

bdrodes avatar Aug 18 '22 14:08 bdrodes

Anyone have a chance to consider this scenario yet?

bdrodes avatar Aug 23 '22 14:08 bdrodes

I might have a solution here. It's not super pretty, and I haven't done any benchmarking on it yet. Consider this snippet:

bool bad(int);
int source();
void sink(int);

void test(bool b) {
  int x;
  if (b) {
    x = source();
    if (bad(x)) {
      return;
    }
  }
  sink(x);
}

where we don't want to report a flow from source() to sink(x). I hope that's a valid interpretation of your problem.

The following, somewhat ugly, barrier achieves this:

/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.controlflow.DefinitionsAndUses
import DataFlow::PathGraph

class BadCall extends FunctionCall {
  BadCall() { this.getTarget().hasGlobalName("bad") }
}

class Conf extends DataFlow::Configuration {
  Conf() { this = "Conf" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr().(Call).getTarget().hasName("source")
  }

  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(Call call | call.getTarget().hasName("sink")).getAnArgument()
  }

  override predicate isBarrier(DataFlow::Node node) {
    exists(VariableAccess use, Variable v |
      use = node.asExpr() and // The node is a use of some variable
      v = use.getTarget() and // and the variable is `v`.
      forex(SsaDefinition def |
        definitionUsePair(v, def, use) // For all possible definitions that might reach the node.
      |
        exists(BadCall badCall, BasicBlock controlled |
          v.getAnAccess() = badCall.getAnArgument() and // There needs to be a call to `bad(v)`
          badCall.(GuardCondition).controls(controlled, true) and // such that the call controls whether we reach `controlled`
          not controlled.getASuccessor+() = use.getBasicBlock() // and controlled does not allow flow to the node
        )
      )
    )
  }
}

from Conf conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, ""

MathiasVP avatar Aug 31 '22 13:08 MathiasVP