HOL FMA error in floating-point (?)

FMA error in floating-point (?)

Open mn200 opened this issue 2 years ago • 16 comments

See, this from the Isabelle mailing list

It is possible that this formalisation is not faithful to the IEEE standard. The commit message of the changeset that is to blame reminded me that I just “ported float_mul_add and float_round … from HOL Kananaskis-12”. I.e., I did not validate these definitions thoroughly.

The corresponding code in HOL Kananaskis-12 (pretty much unchanged in current Kananaskis-14) reads as follows and probably shows the same behaviour: https://github.com/HOL-Theorem-Prover/HOL/blob/kananaskis-12/src/floating-point/binary_ieeeScript.sml#L488

         float_round_with_flags mode
           (if (r1 = 0) /\ (r2 = 0) /\ (signP = z.Sign) then
              signP = 1w
            else mode = roundTowardNegative) (r1 + r2)`

Mar 25 '22 09:03 mn200

At this point, I don't yet have an opinion on whether the HOL4 defintion is right or not. But I do have a reason for caring about it.

We're testing CPU designs by fuzzing them against formal models, and the IEEE 754 model in HOL4 is in effect the "gold model" things are ultimately being compared against.

The CHERI MIPS soft-core doesn't implement FMA, so doesn't encounter this issue. (Not implementing FMA was a deliberate design choice, influenced by concerns about gate count and the added complexity of instructions with 4 operands in the floating point pipeline).

In any case, the MIPS specification for FMA is different from IEEE 754. (Take a look at the L3 model of MIPS floating point, or see the MIPS ISA specification).

We are seeing different behaviour of the FMA instruction between the "spike" RISC-V emulator and the Piccolo RISC-V soft-core (found by tandem execution of a circuit simulation of the core against the emulator). Now, the way we decide which is right in case of disagreement boils down to execution of the HOL4 model in the logic (plus going and re-reading the IEEE 754 specification); at which point, we start to care about whether the HOL4 model of FMA is right.

Another thing we could do is to execute test cases created by IBM's FPGen formal model against the HOL4 model

Jun 27 '22 11:06 michael-roe

I'm happy to be advised by those more knowledgeable than I. For the moment, it sounds as if this still up in the air, but I'm happy with the prospect of pull requests in my future.

Jun 28 '22 01:06 mn200

Interesting, one of the test cases we get by tandem execution of a CPU against a simulator his something close to this: nmsub.s of all (positive) zeros on RISC-V.

Jul 16 '22 10:07 michael-roe

Ok, so I now have a program that will take test cases generated from IBM;s FPGEN formal model and turn them into theorems for HOL4 to prove.

I am not at present testing the sign of zero, but all the tests of fused multiply-add passed. (So HOL4's model agrees with IBM model on the test cases, except possibly for the sign of zero which I'm not testing).

Jul 22 '22 17:07 michael-roe

Good to hear! Thanks.

Jul 23 '22 03:07 mn200

However, see also: http://isabelle.systems/zulip-archive/stream/247541-Mirror:-Isabelle-Users-Mailing-List/topic/.5Bisabelle.5D.20Wrong.20semantics.20for.20fmul_add.20in.20afp.2FIEEE_Float.2E.2E.2E.html which suggests that the behaviour exercising them was indeed the sign of the zeroes.

Jul 25 '22 01:07 mn200

Thanks for the reply!

Checking the sign of the zeros is what I'm going to do next. I agree that the sign of zero is (probably) where the bug is.

Jul 25 '22 10:07 michael-roe

Ok, I now have some failing test cases.

IBM's FPgen has: b32*+ =0 i +1.000000P-126 -1.2801D9P-113 -Zero -> -Zero xu

but ... val f1 = <| Sign := 0w; Exponent := 1w; Significand := 0x0w |> : (23,8)float; val f2 = <| Sign := 1w; Exponent := 14w; Significand := 0x2801d9w |> : (23,8)float; val f3 = <| Sign := 1w; Exponent := 0w; Significand := 0x0w |> : (23,8)float; val expected = <| Sign := 1w; Exponent := 0w; Significand := 0x0w |> : (23,8)float; EVAL 1w = (SND(float_mul_add roundTiesToEven ^f1 ^f2 ^f3)).Sign;

Jul 25 '22 11:07 michael-roe

Seems like the expression in the quoted definition in the first post might need changing.

Jul 26 '22 02:07 mn200

The new/fixed Isabelle code (from the development version of the AFP) has:

definition fmul_add :: "roundmode ⇒ ('t ,'w) float ⇒ ('t ,'w) float ⇒ ('t ,'w) float ⇒ ('t ,'w) float"
  where "fmul_add mode x y z = (let 
    signP = if sign x = sign y then 0 else 1;
    infP = is_infinity x  ∨ is_infinity y
  in
    if is_nan x ∨ is_nan y ∨ is_nan z then some_nan
    else if is_infinity x ∧ is_zero y ∨
           is_zero x ∧ is_infinity y ∨
           is_infinity z ∧ infP ∧ signP ≠ sign z 
      then some_nan
    else if is_infinity z ∧ (sign z = 0) ∨ infP ∧ (signP = 0)
      then plus_infinity
    else if is_infinity z ∧ (sign z = 1) ∨ infP ∧ (signP = 1)
      then minus_infinity
    else let 
      r1 = valof x * valof y;
      r2 = valof z;
      r = r1+r2
    in 
      if r=0 then ( ― ‹Exact Zero Case. Same sign rules as for add apply. ›
        if r1=0 ∧ r2=0 ∧ signP=sign z then zerosign signP 0
        else if mode = To_ninfinity then -0
        else 0
      ) else ( ― ‹Not exactly zero: Rounding has sign of exact value, even if rounded val is zero›
        zerosign (if r<0 then 1 else 0) (round mode r)
      )
  )"

where zerosign is defined:

definition zerosign :: "nat ⇒ ('e ,'f) float ⇒ ('e ,'f) float"
  where "zerosign s a =
    (if is_zero a then (if s = 0 then 0 else - 0) else a)"

HOL4 would write POS0 or NEG0 inside the nested if.

Jul 26 '22 02:07 mn200

The new Isabelle code looks right to me.

Jul 26 '22 06:07 michael-roe

Do you want to have a try at fixing this bug? I can test it if someone else proposes a fix.

Jul 26 '22 08:07 michael-roe

I’ll give it a go.

Jul 26 '22 23:07 mn200

So, this bug is HOL4's floating point is not very likely to matter for real programs, but here is a moderately realistic program (in Standard ML) where it matters:

fun dot_product (sum, _, []) = sum | dot_product(sum, [], _) = sum | dot_product (sum, x::xs, y::ys) = dot_product(Real.*+(x, y, sum), xs, ys);

A theorem you might want to prove is that if dot_product(~0.0, a, b) = ~0.0, then the dot product (computed to arbitrary precision, without rounding) is negative. (If the dot_product is positive 0.0, then nothing can be said).

I think this is what the feature in IEEE 754 is for ... if you have an underflow in computing the dot product, you don't get to know its value even approximately, but you can know that it was negative.

N.B: There's no need to waste time actually trying to prove this theorem; this is just by way of justifying why that case is there in the IEEE 754 specification.

P.S. In CakeML, it looks something like this:

fun dot_product sum a b = case a of [] => sum | x::xs => (case b of [] => sum | y::ys => dot_product (Double.fma(x y sum)) xs ys);

Jul 29 '22 16:07 michael-roe

I've made what I think is the appropriate update to the definition of float_mul_add.

Is there an analogous update that should be made to float_mul_sub?

I've included your test-case from above, but if you have any more in a similar vein, I'd be happy to include them.

Aug 09 '22 00:08 mn200

I have just run the fused-multiply-add tests from IBM's FPgen against your revised float_mul_add, and all the tests now pass. So I think we have fixed the bug in float_mul_add.

Yes, we probably should make the analogous update to float_mul_sub too.

Aug 09 '22 10:08 michael-roe

HOL HOL copied to clipboard

FMA error in floating-point (?)

HOL
HOL copied to clipboard