analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Enhance detection of buffer overflows by introducing variable for length of array/blob

Open michael-schwarz opened this issue 3 years ago • 4 comments

A possible direction that we could go in to convince people that our analysis of globals is useful would be to target the absence of buffer overflows (our if we want to go for something more sophisticated, the absence of buffer overflows in string manipulation).

A critical ability there would be to relate the length of a buffer with variables used to access things inside the buffer. Currently, this only works for the non-relational case.

int readUntil(char arr[], int len) {
	for(int i=0;i < len;i++) {
		char s  = arr[i];
	}
}

int main() {
	int len;
	int top;

	if(top) {
		len = 5;
	} else {
		len = 10;
	}

	// char* ptr = malloc(sizeof(char)*len);
	char ptr[len];
	readUntil(ptr, len);
}

With --enable ana.arrayoob, we warn at the access in readUntil because we do not know that the length of arr is len here (and also, for some reason, when we call readUntil in main(?)). Also, we currently do not warn at all for access of buffers created by malloc, which is why I used the VLA here.

TODOs:

  • [ ] Make out of bounds check apply equally for dynamically allocated memory
  • [ ] Track symbolic relationships between the length of an array and programs variables (just using the relationship between len and the expression used to access the array would not suffice as that could be reassigned later)
    • [ ] Ghost variable len$array$ that is set to len on malloc and then automatically tracks the length
  • [ ] Figure out how (if) this would work for globals (which would be the case we actually care about)

michael-schwarz avatar Mar 03 '22 07:03 michael-schwarz

  • Ghost variable len$array$ that is set to len on malloc and then automatically tracks the length

How do you imagine this working? That there's a pre-transformation of the AST to introduce these such that a relational analysis could track that?

sim642 avatar Mar 03 '22 08:03 sim642

How do you imagine this working? That there's a pre-transformation of the AST to introduce these such that a relational analysis could track that?

Maybe. The nicer thing would be to do it without transformations to the source code at the call to malloc and at the VarDecl for a VLA. That way, one would have distinct ghost variables for things that our heap analysis creates distinct things for, which may be helpful... But I need to think it through to see if there are some further issues with this idea.

michael-schwarz avatar Mar 03 '22 08:03 michael-schwarz

A student of @jerhard is working on a version of this.

michael-schwarz avatar Nov 29 '23 10:11 michael-schwarz

By the way, this is one of the selling points for Mopsa. For example, see their SV-COMP 2023 tool paper or Monat's PhD thesis. Although their solution is significantly more elegant because their analyzer design is specifically made to allow such things in an extremely modular way.

sim642 avatar Nov 29 '23 10:11 sim642