analyzer
analyzer copied to clipboard
Unsound address set must-not-equality check in base
The use of AD.meet
to check address set must-not-equality:
https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/src/analyses/base.ml#L331-L334
is unsound in at least the following cases:
- [ ] First struct field and zero index: https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/tests/regression/02-base/91-ad-meet.c#L6-L16
- [ ] Union fields: https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/tests/regression/02-base/92-ad-union-fields.c#L6-L17
- [ ] Non-first struct field and non-zero index with alignment: https://github.com/goblint/analyzer/issues/822#issuecomment-1212175339.
For both programs we report that the assert definitely fails, but when compiled and run, it succeeds.
Using meet
would only work if addresses were represented in their canonical form using a byte offset from the variable.
The unsoundness of comparing addresses with a struct field and an index as offsets is also exhibited for non-zero offsets:
https://github.com/goblint/analyzer/blob/712c6cef810aa3c6635a677bc9eef7dd8e7cc599/tests/regression/02-base/93-ad-struct-offset.c#L9-L27
The alignment of fields within objects is implementation defined, thus one generally cannot produce definite answers when comparing integer offsets with field offsets on structs.
Since #809 makes this problem even more urgent, we probably need a quicker fix than completely refactoring lvalue representations to carry a canonical form or whatnot. A simpler implementation for just AD.may_equal a1 a2
would be to pairwise check all lvalues from a1
and a2
, and check their may equality. That could be implemented by calculating the canonical offset numerically, a la Cil.bitsOffset
and checking those integers for may equality. The only problem is that Cil.bitsOffset
works only with constant indices, but we'll have abstract ones, so it'll have to be partially reimplemented in Goblint. However, CIL can still provide us with the byte/bit offsets of struct fields (using Machdep
), so it shouldn't be too difficult.
A remark from Zulip on optimizing the pairwise may alias checking. Checking all pairs is overly wasteful because addresses may only alias if their varinfo
s (without offset
s) are equal, so it suffices to do pairwise checking within each varinfo
.
The projective bucketing from #809 doesn't directly allow that because the ProjectiveSet
buckets are per varinfo
and representative offset. However, that could be replaced with a two nested ProjectiveSet
s: outer split by varinfo
only and inner split by representative offset in each varinfo
. Then may alias checking can be performed pairwise within each outer bucket, but over all the elements in the corresponding inner buckets.
Currently ProjectiveSet
completely abstracts away the map structure, so an additional special function (maybe bucket_exists2
?) would have to be implemented to do that.