esbmc icon indicating copy to clipboard operation
esbmc copied to clipboard

Add type assertions when creating irep2s

Open mikhailramalho opened this issue 4 years ago • 7 comments

We are currently generating a number of irep/irep2 expr's where the operands have different types. The SMT solvers don't like it, and even the C standard tries to avoid it by introducing type promotions.

The problem causes bugs like #148, where get_ulong is used to generate constants and will generate conflicting types in a number of operations depending on the arch (32 bits vs 64 bits).

We should add assertions to irep2's constructors, so our regressions can catch these problems. Something like:

From c1a8540907402a890a4174e948ce9b580719c2e8 Mon Sep 17 00:00:00 2001
From: "Mikhail R. Gadelha" <[email protected]>
Date: Thu, 11 Jun 2020 19:07:47 -0400
Subject: [PATCH] Irep2 assertions

Signed-off-by: Mikhail R. Gadelha <[email protected]>
---
 src/util/irep2_expr.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/src/util/irep2_expr.h b/src/util/irep2_expr.h
index fe0487421..837ba9619 100644
--- a/src/util/irep2_expr.h
+++ b/src/util/irep2_expr.h
@@ -307,6 +307,20 @@ public:
     const expr2tc &s2)
     : expr2t(t, id), side_1(s1), side_2(s2)
   {
+    if(is_pointer_type(s1) && is_pointer_type(s2))
+    {
+      assert(s1->type->get_width() == s2->type->get_width());
+    }
+    else
+    {
+      try
+      {
+        assert(s1->type == s2->type);
+      }
+      catch(array_type2t::inf_sized_array_excp *e)
+      {
+      }
+    }
   }
   relation_data(const relation_data &ref) = default;
 
@@ -456,6 +470,7 @@ public:
     const expr2tc &v2)
     : arith_ops(t, id), side_1(v1), side_2(v2)
   {
+    assert(v1->type == v2->type);
   }
   arith_2ops(const arith_2ops &ref) = default;
 
@@ -506,6 +521,7 @@ public:
     const expr2tc &rm)
     : arith_ops(t, id), rounding_mode(rm), side_1(v1), side_2(v2)
   {
+    assert(v1->type == v2->type);
   }
   ieee_arith_2ops(const ieee_arith_2ops &ref) = default;
 
@@ -539,6 +555,8 @@ public:
     const expr2tc &rm)
     : arith_ops(t, id), rounding_mode(rm), value_1(v1), value_2(v2), value_3(v3)
   {
+    assert(v1->type->get_width() == v2->type->get_width());
+    assert(v2->type->get_width() == v3->type->get_width());
   }
   ieee_arith_3ops(const ieee_arith_3ops &ref) = default;
 
-- 
2.25.1

While I'm not sure about the pointer special case in class relation_data, just running our regression with this patch already triggers several errors.

Most of them are not causing problems because they are being either sliced away or simplified away, but we shouldn't be creating these wrong expressions in the first place.

mikhailramalho avatar Jun 11 '20 23:06 mikhailramalho

I've created the branch irep2-assertions with assertions to arithmetic and relational operations, we should add some more:

  • true and false branches of a ternary-if should have the same type, the condition should be boolean.
  • exponent and significand of ieee_floatt should be equal to the ones stored in the type. I'm not sure about fixedbvt.
  • The number of members in a constant_datatype_data should be equal to the one stored in the type.
  • operands in a logical op should be booleans.
  • operands in bitops should be bitvectors.
  • type of operand in arith_1op should be equal to the one stored in the type.
  • index field of index2t should have index_type2() type.

mikhailramalho avatar Jun 12 '20 00:06 mikhailramalho

Stale issue message

github-actions[bot] avatar Aug 11 '20 00:08 github-actions[bot]

@fbrausse: This issue might be interest to you :-)

lucasccordeiro avatar Jan 16 '22 15:01 lucasccordeiro

Doesn't this approach assume that expressions are immutable? What are downsides to having them immutable?

fbrausse avatar Jan 26 '22 14:01 fbrausse

yeah, irep2 was designed to be immutable: for instance, stuff like simplification always generates new objects. We do have some things that are lazily instantiated tho, like the hash/crc.

I think the biggest disadvantage is the increased memory usage. We had some strategies in place to try to reduce that (like the type pools to return types that were already generated), but because of the migrate layer we have between the frontend and the rest of esbmc, they were never stable enough and were removed.

(Also, note that guardt should be an irep2 but it was never migrated.)

mikhailramalho avatar Jan 26 '22 15:01 mikhailramalho

Many thanks! I've tried to make constant_struct2t immutable. For efficiency reasons it would be good for its constructor to take a std::vector<expr2tc> &&members instead of the const reference it takes now. For that I needed to make the something2tc(Args...) constructor truly forwarding (by (Args &&...) and std::forward), but that breaks in lots of places all over. Maybe we should first make irep2 constructible by rvalue references?

fbrausse avatar Jan 26 '22 15:01 fbrausse

I would add "unsigned array sizes" to the above list.

https://github.com/esbmc/esbmc/pull/619#issuecomment-1037063626 might be relevant:

I don't see signed array sizes having any valid use case in expr2; is there any? Enforcing unsigned array sizes breaks in a lot of other regression tests though [...]

fbrausse avatar Feb 16 '22 15:02 fbrausse