symcc
symcc copied to clipboard
symcc fails to solve simple equality constraints
The following program reads a long int, assigns the long int to an unsigned long, then branches on whether the unsigned long is equal to the constant 10086
:
#include <unistd.h>
int main() {
long ival;
unsigned long abs_ival;
read(STDIN_FILENO, &ival, sizeof ival);
abs_ival = (unsigned long)ival;
if(abs_ival == 10086) {
return 9;
}
return 10;
}
For this program, symcc is able to generate an input that's equal to 10086
and causes the program to return 9
instead of 10
.
On the other hand, if we add in a conditional branch on the sign of ival, symcc fails to generate a byte sequence that corresponds to the constant 10086
.
#include <unistd.h>
int main() {
long ival;
unsigned long abs_ival;
read(STDIN_FILENO, &ival, sizeof ival);
if (ival < 0) {
abs_ival = 0U - (unsigned long)ival;
}
else {
abs_ival = (unsigned long)ival;
}
if(abs_ival == 10086) {
return 9;
}
return 10;
}
I'm aware that symcc uses optimistic solving under the hood, but this program is simple and the pure_concolic_execution.sh helper script is stuck at a fixed point of 2 inputs, neither of which corresponds to the constant 10086.
We found this issue when trying to symbolically execute python with symcc. The program above is a fragment of Python's long -> big int conversion function.
After investigating this issue further, I found that at -O0 level, the second program actually works.
I looked into the llvm IR of the second program at different optimization levels.
At -O0, there is a snippet that compares abs_ival
to the constant:
%44 = call i8* @_sym_build_integer(i64 10086, i8 64)
%45 = call i8* @_sym_build_equal(i8* %40, i8* %44)
At -O3, I don't see a similar snippet that performs the comparison. That's probably why I don't see the solution of abs_ival == 10086
being generated by symcc.
I recall that according to the paper, symcc inserts its calls to the runtime at an earlier stage of the compilation pipeline so optimization levels shouldn't have a large impact on its behavior. Has that changed in the current implementation?
Here are links to the full LLVM code: -O0: https://gist.github.com/yiyunliu/ae48fbfb0b7fdfc5977ccd7699c07dd0 -O3: https://gist.github.com/yiyunliu/a7e57838150a8f6ca99eff5e98731e2f