analyzer
analyzer copied to clipboard
Cleaning up and documenting region analysis
Simmo fixed some soundness issues (#106) that arise when the region analysis is executed with symbolic equality and field-sensitive refinement for all benchmarks. Previously, these flags were only turned on for specific benchmarks. The pull request fixed all surface issues that emerged, but it did not raise our confidence in the code.
- [ ] Figure out if we really need general
exp
rather thanlval
to be stored for accesses. - [ ] The field offset handling is fairly suspect and should be documented. I am not 100% convinced adding field-sensitivity to the region analysis in this way is sound, even for non-overlapping fields.
- [ ] Eliminate the todos for the offset handling. Why are some offsets dropped and some are handled.
- [ ] Build the partition domain on top of the Powerset domain.
- [ ] Unsoundness when a pointer to fresh region escapes to another thread, see https://github.com/goblint/analyzer/pull/641#discussion_r825682972.
- [ ] Fixpoint error when interprocedural, see https://github.com/goblint/analyzer/commit/16077e3b146d464072004eaf0cd1480a2a7b99d1.
Reveals multiple issues:
- [ ] Narrowing should be no-op instead of meet?
- [ ] Bullet to bullet
assign
special case not implemented. - [ ]
enter
doesn't remove outer scope locals. - [ ] Interprocedural diamond from region paper not implemented.
- [ ] Bullet concretization semantics are unclear.
- [ ] Postsolver verify checks post-solution which isn't necessary with non-monotonicity and narrowing.
The region analysis is very simple on paper, but we have not worked out these interactions with field-sensitivity. We may need a deeper understanding of these things. We take a pointer p
into an equivalence class and upon p -> f
just appending f
to all members of that class. This is sound under certain specific assumptions, but it's not clear that these assumptions are maintained by all operations of the region analysis.
While doing this, one should probably also look at #66 again.
Just to document this here: yesterday we discovered region analysis unsoundness in ldv-linux-3.14-races/linux-3.14--drivers--usb--misc--iowarrior.ko.cil.i
as well, where we miss a race on the opened
field of iowarrior
struct, because it appears to have no region
(bullet) due to an unknown pointer.
Although the benchmark itself is also questionable, since in one case dev->opened
is accessed on a malloc
ed but uninitialized struct, which is UB and shouldn't be in sv-benchmarks to begin with.
I believe the following program demonstrates the same behavior as the iowarrior benchmark:
// PARAM: --set ana.activated[+] region
#include <pthread.h>
struct s {
int i;
};
struct t {
struct s *p;
};
void *t_fun(void *arg) {
struct t *t2 = malloc(sizeof(struct t));
// struct s *s2; // s2 is not in RegMap
struct s *s2 = t2->p; // s2 will have bullet from t2 in RegMap
s2->i = 0; // RACE
return NULL;
}
int main() {
pthread_t id;
pthread_create(&id, NULL, t_fun, NULL);
struct t *t1 = malloc(sizeof(struct t));
// struct s *s1; // s1 is not in RegMap
struct s *s1 = t1->p; // s1 will have bullet from t1 in RegMap
s1->i = 0; // RACE
return 0;
}
In terms of the base analysis s1
is an unknown pointer in both cases, but in the region analysis they are handled differently:
- When uninitialized, it doesn't appear at all in
RegMap
, so it doesn't have any known region. - When explicitly initialized to an unknown pointer from uninitialized memory, it is given a bullet in
RegMap
. The expression evaluation used by region analysis is surprisingly symbolic and doesn't query other analyses for may-point-to, etc at all. If it did, then it might be possible to drop the bullet assignment when going through uninitialized memory (which region is otherwise unaware of).
As of #641, another example for an unsoundness is 09/38
.
When @karoliineh extracted a simplified non-escape analysis from region, a fixpoint error arose from a ZSTD test we have. I managed to reproduce an analogous fixpoint error with the original region analysis here: https://github.com/goblint/analyzer/commit/16077e3b146d464072004eaf0cd1480a2a7b99d1.
It's not yet clear to me, what the root cause is, but there's odd behavior with interprocedural region analysis. We found out that the region analysis does interprocedurality very naively by just passing the entire outer map into called functions, including all locals of the outer function that are out of scope in the called function. On combine
, it just hopes the entire outer function state was passed through to the called function's return node and combine
simply takes it from there. So it doesn't work like our other analyses that remove the outer state in enter
and add it back in combine
.
This on its own shouldn't be incorrect (just inefficient) though. Also, this seems different from how the region analysis paper describes interprocedural analysis.