analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Cleaning up and documenting region analysis

Open vesalvojdani opened this issue 3 years ago • 4 comments

Simmo fixed some soundness issues (#106) that arise when the region analysis is executed with symbolic equality and field-sensitive refinement for all benchmarks. Previously, these flags were only turned on for specific benchmarks. The pull request fixed all surface issues that emerged, but it did not raise our confidence in the code.

  • [ ] Figure out if we really need general exp rather than lval to be stored for accesses.
  • [ ] The field offset handling is fairly suspect and should be documented. I am not 100% convinced adding field-sensitivity to the region analysis in this way is sound, even for non-overlapping fields.
  • [ ] Eliminate the todos for the offset handling. Why are some offsets dropped and some are handled.
  • [ ] Build the partition domain on top of the Powerset domain.
  • [ ] Unsoundness when a pointer to fresh region escapes to another thread, see https://github.com/goblint/analyzer/pull/641#discussion_r825682972.
  • [ ] Fixpoint error when interprocedural, see https://github.com/goblint/analyzer/commit/16077e3b146d464072004eaf0cd1480a2a7b99d1. Reveals multiple issues:
    • [ ] Narrowing should be no-op instead of meet?
    • [ ] Bullet to bullet assign special case not implemented.
    • [ ] enter doesn't remove outer scope locals.
    • [ ] Interprocedural diamond from region paper not implemented.
    • [ ] Bullet concretization semantics are unclear.
    • [ ] Postsolver verify checks post-solution which isn't necessary with non-monotonicity and narrowing.

The region analysis is very simple on paper, but we have not worked out these interactions with field-sensitivity. We may need a deeper understanding of these things. We take a pointer p into an equivalence class and upon p -> f just appending f to all members of that class. This is sound under certain specific assumptions, but it's not clear that these assumptions are maintained by all operations of the region analysis.

vesalvojdani avatar Oct 02 '20 09:10 vesalvojdani

While doing this, one should probably also look at #66 again.

michael-schwarz avatar Oct 08 '20 16:10 michael-schwarz

Just to document this here: yesterday we discovered region analysis unsoundness in ldv-linux-3.14-races/linux-3.14--drivers--usb--misc--iowarrior.ko.cil.i as well, where we miss a race on the opened field of iowarrior struct, because it appears to have no region (bullet) due to an unknown pointer.

Although the benchmark itself is also questionable, since in one case dev->opened is accessed on a malloced but uninitialized struct, which is UB and shouldn't be in sv-benchmarks to begin with.


I believe the following program demonstrates the same behavior as the iowarrior benchmark:

// PARAM: --set ana.activated[+] region
#include <pthread.h>

struct s {
  int i;
};

struct t {
  struct s *p;
};

void *t_fun(void *arg) {
  struct t *t2 = malloc(sizeof(struct t));
  // struct s *s2; // s2 is not in RegMap
  struct s *s2 = t2->p; // s2 will have bullet from t2 in RegMap
  s2->i = 0; // RACE
  return NULL;
}

int main() {
  pthread_t id;
  pthread_create(&id, NULL, t_fun, NULL);

  struct t *t1 = malloc(sizeof(struct t));
  // struct s *s1; // s1 is not in RegMap
  struct s *s1 = t1->p; // s1 will have bullet from t1 in RegMap
  s1->i = 0; // RACE
  return 0;
}

In terms of the base analysis s1 is an unknown pointer in both cases, but in the region analysis they are handled differently:

  • When uninitialized, it doesn't appear at all in RegMap, so it doesn't have any known region.
  • When explicitly initialized to an unknown pointer from uninitialized memory, it is given a bullet in RegMap. The expression evaluation used by region analysis is surprisingly symbolic and doesn't query other analyses for may-point-to, etc at all. If it did, then it might be possible to drop the bullet assignment when going through uninitialized memory (which region is otherwise unaware of).

sim642 avatar Mar 03 '22 08:03 sim642

As of #641, another example for an unsoundness is 09/38.

michael-schwarz avatar Mar 18 '22 08:03 michael-schwarz

When @karoliineh extracted a simplified non-escape analysis from region, a fixpoint error arose from a ZSTD test we have. I managed to reproduce an analogous fixpoint error with the original region analysis here: https://github.com/goblint/analyzer/commit/16077e3b146d464072004eaf0cd1480a2a7b99d1.

It's not yet clear to me, what the root cause is, but there's odd behavior with interprocedural region analysis. We found out that the region analysis does interprocedurality very naively by just passing the entire outer map into called functions, including all locals of the outer function that are out of scope in the called function. On combine, it just hopes the entire outer function state was passed through to the called function's return node and combine simply takes it from there. So it doesn't work like our other analyses that remove the outer state in enter and add it back in combine.

This on its own shouldn't be incorrect (just inefficient) though. Also, this seems different from how the region analysis paper describes interprocedural analysis.

sim642 avatar Sep 25 '23 10:09 sim642