analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Fix EvalAssert transformation to work on coreutil programs

Open jerhard opened this issue 3 years ago • 6 comments

The EvalAssert transformation is currently not able to produce C-output for the bench/coreutil programs, caused by several problems.

Todo:

  • [x] Fix crash that occurs because some globals are not found by during the transformation #852
  • [x] Add declarations of functions before function definitions in the Cil-file that is produced. This way, it is ensured that functions are declared, before references to them are made in asserts (e.g. in asserts about function pointers).
  • [x] ~Fix generation of asserts about pointers to malloced-memory. With malloced memory, there are (tmp-)variables of type void* that point to it. The assertions that are currently generated for these variables are not type correct, e.g. because a field offset is applied to a de-referenced void-pointer.~ Avoid generating invariants for pointers to abstract values of type TVoid for now.
  • [x] Fix error when running gcc on the output of the transformation generated for cp_comb.c: extent-scan.c:107:18: error: ‘struct fiemap’ has no member named ‘f’. Fix generation of invariants for unions.
  • [x] Add option to not generate asserts for blobs.

Testing this branch on bench/coreutil programs

Run goblint on the selected input file, e.g. cp_comb.c

./goblint  --set pre.cppflags[+] "--std=gnu89" --trans.activated '["assert"]' ../bench/coreutils/cp_comb.c

This will generate a transformed.c. One can check whether this compiles as follows:

gcc transformed.c  -lrt --std=gnu89

jerhard avatar Oct 13 '22 13:10 jerhard

  • Fix generation of asserts about pointers to malloced-memory. With malloced memory, there are (tmp-)variables of type void* that point to it. The assertions that are currently generated for these variables are not type correct, e.g. because a field offset is applied to a de-referenced void-pointer.

I didn't check, but this sounds a lot like something I fixed in #796: https://github.com/goblint/analyzer/commit/73df4cac9812c65a3446c7ea53770fa6d45d59d6. If that's the right thing, then you can just cherry pick it.

sim642 avatar Oct 13 '22 13:10 sim642

I didn't check, but this sounds a lot like something I fixed in https://github.com/goblint/analyzer/pull/796: https://github.com/goblint/analyzer/commit/73df4cac9812c65a3446c7ea53770fa6d45d59d6. If that's the right thing, then you can just cherry pick it.

I needed to additionally add a check whether the pointed to abstract block has type TVoid, because otherwise it would still generate invariants without necessary casts. In the future one may want to look at the abstract value more closely, e.g. by inferring from the fields of a Struct which type it has. Then one could also generate invariants for these cases.

jerhard avatar Oct 13 '22 15:10 jerhard

I tested this with bench/cp_comb.c: when one sets the surroundByAtomic in EvalAssert to false, and adds an #include<assert.h> to the resulting transformed.c, the C-file can be compiled.

However, the executable one obtains crashes with a failed assertion:

a.out: hash.c:606: hash_initialize: Assertion `table___0->tuning == & default_tuning' failed.
Aborted (core dumped)

jerhard avatar Oct 14 '22 09:10 jerhard

However, the executable one obtains crashes with a failed assertion:

a.out: hash.c:606: hash_initialize: Assertion `table___0->tuning == & default_tuning' failed.
Aborted (core dumped)

The code is this:

#line 605
  tmp = malloc(sizeof(*table___0));
#line 605
  table___0 = (Hash_table *)tmp;
#line 606
  if ((unsigned long )table___0 == (unsigned long )((void *)0)) {
#line 607
    return ((Hash_table *)((void *)0));
  }
#line 609
  if (! tuning) {
#line 610
    tuning = & default_tuning;
  }

and the tuning argument actually always is a NULL pointer.

I think the issue is due to flow-insensitivity of the global alloc variable, which initially has all bot fields, but then gets default_tuning in there. But we don't distinguish at which point the initialization has happened, so we think the assertion about the global should also hold before. This is quite an issue that we haven't considered before regarding the invariant generation.

sim642 avatar Oct 14 '22 11:10 sim642

I think the issue is due to flow-insensitivity of the global alloc variable, which initially has all bot fields, but then gets default_tuning in there. But we don't distinguish at which point the initialization has happened, so we think the assertion about the global should also hold before. This is quite an issue that we haven't considered before regarding the invariant generation.

Yes, indeed, we came to the same conclusion on the Munich side with @michael-schwarz. For the general case, involving possible multithreading, one would have to perform a must-initialized analysis to ensure that any abstract value we have for globals isn't "shadowing" a possible uninitialized value.

Since at least cp_comb is a single-threaded program and we assume that no threads are created, I reran the transformation with --disable sem.unknown_function.spawn. This should hopefully be sufficient to avoid that the analysis enters the multithreaded mode. When compiling the transformed code and running the resulting binary, the program then does crash with a seg_fault, instead of a failed assertion. There may be some assertion that dereferences some pointer that is not set properly.

jerhard avatar Oct 14 '22 11:10 jerhard

Looking at this with @michael-schwarz and Helmut, we found that performing a context-insensitive analysis and not generating asserts for blobs gets rid of the asserts that fail during some simple executions of cp.

jerhard avatar Oct 14 '22 16:10 jerhard

Something else comes to mind here: Do we need to submit Goblint as a validator as well in order to benefit from us being able to solve such huge tasks in sv-comp?

Otherwise our ability to reverify these will not benefit us when it comes to points.

michael-schwarz avatar Oct 23 '22 13:10 michael-schwarz

That's a good point, but it's too late to do anything about it for this year. We don't have legitimate witness validation support ready (and nor are YAML witnesses as a whole). We can't just use Goblint as a validator, which completely ignores any witness and solves from scratch, because it constitutes as blatant cheating: we would self-confirm all our own witnesses without actually confirming anything. CPAchecker and Ultimate have well-defined and -documented invariant generation subsystems that they disable for validation purposes, we don't. And this is supported by the fact that they have unconfirmed witnesses.

We should just focus on getting these benchmarks fixed and merged first. The plan is to publish a non-SV-COMP-tool-paper on a proper approach to witness validation with Goblint and bring that to SV-COMP legitimately, like it was done with the existing GraphML witnesses.

sim642 avatar Oct 24 '22 08:10 sim642

There are still three issues in the generated code that required manual fixing:

  • e.g. here, asserts are generated for table -- which is a type, not a variable.
  • e.g. here, the introduction of additional function declarations lead to compiler warnings. Since in other places the introduction of additional function declarations was necessary, I would keep it as is, for now.
  • here, a coreutil specific issue: I had to replace the type definition of gl_recursive_lock_t with pthread_mutex_t, as the code contained (probably) an old (now incompatible) definition.

jerhard avatar Oct 25 '22 15:10 jerhard