catala icon indicating copy to clipboard operation
catala copied to clipboard

Bug in Z3 encoding of pattern-matching

Open denismerigoux opened this issue 2 years ago • 3 comments

When running the command :

 dune exec compiler/catala.exe -- Proof examples/allocations_logement/allocations_logement.catala_fr -s ÉligibilitéAidePersonnelleLogement 

on the branch allocations_logement, we get the following error message:

catala: internal error, uncaught exception:
        Z3.Error("Sort mismatch at argument #2 for function (declare-fun => (Bool Bool) Bool) supplied sort is Real")
        Raised by primitive operation at Z3.Boolean.mk_implies in file "../src/api/ml/z3.ml" (inlined), line 573, characters 19-38
        Called from Verification__Z3backend.translate_expr.(fun) in file "compiler/verification/z3backend.real.ml", line 732, characters 12-50

This error comes from the encoding in Z3 of the following expression:

[ERROR] Encoding error!                

   --> examples/allocations_logement/code_construction_legislatif.catala_fr
    | 
166 |     (selon ménage.logement.propriétaire sous forme
    |      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
167 |       -- DemandeurOuConjointOuParentOuViaPartsSociétés de parts: parts
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
168 |       -- ParentOuAutre.Autre: 0,0) <. seuil_l822_3_parts_propriété et
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    + Code de la construction et de l'habitation
    +-+ Partie législative
      +-+ Livre VIII : Aides personnelles au logement
        +-+ Titre II : Dispositions communes aux aides personnelles au logement
          +-+ Chapitre II : Conditions générales d'attribution
            +-+ Article L822-3

More precisely, the arms of this match are encoded as follow:

[DEBUG] Arm:
λ (parts_2327: decimal) → parts_2327
Z3 Arm:
(|DemandeurOuConjointOuParentOuViaPartsSociétés!0|
  (|propriétaire| (logement |ménage_979|)))
[DEBUG] Arm:
λ (__2328: unit) → 0.
Z3 Arm:
0.0

It seems like the encoding of the first arm, which is the identity function, is broken since it speaks about totally different things in Z3. Maybe a context problem ? It's likely that the bug comes from

https://github.com/CatalaLang/catala/blob/6ca133003e6a48a5e5d69cde7025c8aa2fe4388f/compiler/verification/z3backend.real.ml#L632-L654

denismerigoux avatar Mar 31 '22 09:03 denismerigoux

Note: I was unable to reproduce in a MWE with a similar enumeration but without a complicated context around.

denismerigoux avatar Mar 31 '22 09:03 denismerigoux

Thanks for the heads-up about this. The issue stems from an incorrect invariant that we assumed for the Z3 generation: We previously thought that any EMatch would be returning a boolean, i.e., pattern-matching in VCs would always return booleans in branches.

The behaviour observed here shows this is not always the case, in particular, you can have something like

(match x with
| A (y:decimal) -> y
| B (_:unit) -> 0) <. 0

in a VC, where some internal value is defined through pattern-matching.

After discussion with @denismerigoux, there are two possibilities to tackle this issue.

  1. Inline all terms inside match arms during VC generation, to ensure that a VC arm always return a boolean. We decided agaisnt this solution, as the size of the VC might explode when doing so, and lead to high code-duplication.
  2. Encode a match result as a temporary result variable, and propagate this variable. For instance, in the example above, we would say that tmp!match_z3 is equal to match x with | A (y:decimal) -> y | B (_:unit) -> 0, and translate this term into Z3 by tmp!match_z3. This solution avoids the need for trusted modifications during VC generation and is also more generic. A PoC is implemented in the afromher_verification branch. The main issue with this solution is that, to define a Z3 variable corresponding to the match, we need to have access to the expected type of the EMatch expression. Currently exposed functions in DCalc.Typing (i.e., infer_type), do not handle free variables, which is a problem when considering dcalc subterms (for instance, changes on the afromher_verification branch lead to failures in some of the existing tests for this reason). @denismerigoux mentioned that this is related to the discussion in #152. Concretely, we would need to embed the type of each subexpression inside the AST itself. @AltGr, is it something you might be able to look at?

R1kM avatar May 19 '22 23:05 R1kM

@R1kM you can take another look at this, #152 is closed!

denismerigoux avatar Jul 29 '22 16:07 denismerigoux