sv-benchmarks
sv-benchmarks copied to clipboard
Strict aliasing violations in SVCOMP benchmarks?
Hey guys,
I got very interested in the strict aliasing rule recently and am curious if any of SVCOMP benchmarks violates it. It seems that some benchmarks actually do. For example, in line ntdrivers/parport_true-unreach-call.i.cil.c
, PptBuildResourceList
is called with the third argument being ChipAddr
of type PUCHAR*
while the declared type of the third argument is PULONG
. I think there may be strict aliasing violation here. The following is the simple experiment I did,
typedef unsigned long ULONG;
typedef ULONG *PULONG;
typedef unsigned char UCHAR;
typedef UCHAR *PUCHAR;
#include <stdio.h>
void foo(PULONG x, PUCHAR y[])
{
*x = 0;
*y = (UCHAR *)1;
printf("%lu\n", *x);
}
int main()
{
PUCHAR ChipAddr[4];
ChipAddr[0] = (UCHAR *)920;
ChipAddr[1] = (UCHAR *)622;
ChipAddr[2] = (UCHAR *)348;
ChipAddr[3] = (UCHAR *)46;
foo((PULONG)ChipAddr, ChipAddr);
return 0;
}
$ gcc test1.c -O0 && ./a.out
$ 1
$ gcc test1.c -O2 && ./a.out
$ 0
So the compiler seems to take advantage of strict aliasing rule in this example.
I think the purpose of this issue is just to raise more awareness of undefined behaviors and discussions about it. I still don not understand strict aliasing completely. So feedback is very appreciated.
I do not think that undefined behaviour is at play here. The assumption of strict aliasing enables additional (unsound) assumptions in the alias analysis used by GCC to build SSA form. Thus the compiler will be mis-optimizing the given code.
We surely can opt to agree to make all benchmarks adhere to strict aliasing conventions, but I'd rather see the opposite to be of interest: software verification tools should be able to find bugs even when strict aliasing is violated.
I agree with @tautschnig. In particular, as far as I understand the C standard, using a char pointer to access individual bytes of some object is legal.
@tautschnig @PhilippWendler I think my point is that there is undefined behavior in both the toy example and the benchmark. C standard says,
[C99, 6.5:7] An object shall have its stored value accessed only by an
lvalue expression that has one of the following types:76)
– a type compatible with the effective type of the object,
– a qualified version of a type compatible with the effective type of the
object,
– a type that is the signed or unsigned type corresponding to the effective
type of the object,
– a type that is the signed or unsigned type corresponding to a qualified
version of the effective type of the object,
– an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a subaggregate
or contained union), or
– a character type.
It seems they violated these rules. As far as I see and also pointed out by @tautschnig, strict aliasing is an assumption used by type-based analysis based on these C standard rules. So the compiler, I'm afraid, is doing the correct thing here although it may contradict to the programmer's will.
So let's say given the fact there is such undefined behavior, should we distinguish it from other undefined behaviors such as signed integer overflow and agree a model for type punning, etc.?
In my opinion we first need to get rid of all the Cil-processed benchmarks. These are all affected by the kind of behaviour you are pointing out, as the pointers and (long) integers are being converted back and forth. We might then end up with benchmarks where the behaviour pointed out occurs in the original form of the benchmark, making it much more interesting I'd say.
@tautschnig Interesting. Would it be possible for you to point out an example of the incorrectly processed files?
I have not claimed they are incorrectly processed - they have just been processed by Cil with settings such as to rewrite various access paths into arithmetic over a mix of pointers and integers. Really what every compiler would do when translating to assembly. But obviously that loses type safety. A rather arbitrary example is this one.
@tautschnig Thanks for offering an example. This is indeed interesting.
What is the current state of this? @shaobo-he Do you still think that this program has undefined behavior? Can you explain precisely why and where?
Otherwise I think this issue should be closed.