HOL
HOL copied to clipboard
FMA error in floating-point (?)
See, this from the Isabelle mailing list
It is possible that this formalisation is not faithful to the IEEE standard. The commit message of the changeset that is to blame reminded me that I just “ported float_mul_add and float_round … from HOL Kananaskis-12”. I.e., I did not validate these definitions thoroughly.
The corresponding code in HOL Kananaskis-12 (pretty much unchanged in current Kananaskis-14) reads as follows and probably shows the same behaviour: https://github.com/HOL-Theorem-Prover/HOL/blob/kananaskis-12/src/floating-point/binary_ieeeScript.sml#L488
float_round_with_flags mode
(if (r1 = 0) /\ (r2 = 0) /\ (signP = z.Sign) then
signP = 1w
else mode = roundTowardNegative) (r1 + r2)`
At this point, I don't yet have an opinion on whether the HOL4 defintion is right or not. But I do have a reason for caring about it.
We're testing CPU designs by fuzzing them against formal models, and the IEEE 754 model in HOL4 is in effect the "gold model" things are ultimately being compared against.
The CHERI MIPS soft-core doesn't implement FMA, so doesn't encounter this issue. (Not implementing FMA was a deliberate design choice, influenced by concerns about gate count and the added complexity of instructions with 4 operands in the floating point pipeline).
In any case, the MIPS specification for FMA is different from IEEE 754. (Take a look at the L3 model of MIPS floating point, or see the MIPS ISA specification).
We are seeing different behaviour of the FMA instruction between the "spike" RISC-V emulator and the Piccolo RISC-V soft-core (found by tandem execution of a circuit simulation of the core against the emulator). Now, the way we decide which is right in case of disagreement boils down to execution of the HOL4 model in the logic (plus going and re-reading the IEEE 754 specification); at which point, we start to care about whether the HOL4 model of FMA is right.
Another thing we could do is to execute test cases created by IBM's FPGen formal model against the HOL4 model
I'm happy to be advised by those more knowledgeable than I. For the moment, it sounds as if this still up in the air, but I'm happy with the prospect of pull requests in my future.
Interesting, one of the test cases we get by tandem execution of a CPU against a simulator his something close to this: nmsub.s of all (positive) zeros on RISC-V.
Ok, so I now have a program that will take test cases generated from IBM;s FPGEN formal model and turn them into theorems for HOL4 to prove.
I am not at present testing the sign of zero, but all the tests of fused multiply-add passed. (So HOL4's model agrees with IBM model on the test cases, except possibly for the sign of zero which I'm not testing).
Good to hear! Thanks.
However, see also: http://isabelle.systems/zulip-archive/stream/247541-Mirror:-Isabelle-Users-Mailing-List/topic/.5Bisabelle.5D.20Wrong.20semantics.20for.20fmul_add.20in.20afp.2FIEEE_Float.2E.2E.2E.html which suggests that the behaviour exercising them was indeed the sign of the zeroes.
Thanks for the reply!
Checking the sign of the zeros is what I'm going to do next. I agree that the sign of zero is (probably) where the bug is.
Ok, I now have some failing test cases.
IBM's FPgen has: b32*+ =0 i +1.000000P-126 -1.2801D9P-113 -Zero -> -Zero xu
but ...
val f1 = <| Sign := 0w; Exponent := 1w; Significand := 0x0w |> : (23,8)float
;
val f2 = <| Sign := 1w; Exponent := 14w; Significand := 0x2801d9w |> : (23,8)float
;
val f3 = <| Sign := 1w; Exponent := 0w; Significand := 0x0w |> : (23,8)float
;
val expected = <| Sign := 1w; Exponent := 0w; Significand := 0x0w |> : (23,8)float
;
EVAL 1w = (SND(float_mul_add roundTiesToEven ^f1 ^f2 ^f3)).Sign
;
Seems like the expression in the quoted definition in the first post might need changing.
The new/fixed Isabelle code (from the development version of the AFP) has:
definition fmul_add :: "roundmode ⇒ ('t ,'w) float ⇒ ('t ,'w) float ⇒ ('t ,'w) float ⇒ ('t ,'w) float"
where "fmul_add mode x y z = (let
signP = if sign x = sign y then 0 else 1;
infP = is_infinity x ∨ is_infinity y
in
if is_nan x ∨ is_nan y ∨ is_nan z then some_nan
else if is_infinity x ∧ is_zero y ∨
is_zero x ∧ is_infinity y ∨
is_infinity z ∧ infP ∧ signP ≠ sign z
then some_nan
else if is_infinity z ∧ (sign z = 0) ∨ infP ∧ (signP = 0)
then plus_infinity
else if is_infinity z ∧ (sign z = 1) ∨ infP ∧ (signP = 1)
then minus_infinity
else let
r1 = valof x * valof y;
r2 = valof z;
r = r1+r2
in
if r=0 then ( ― ‹Exact Zero Case. Same sign rules as for add apply. ›
if r1=0 ∧ r2=0 ∧ signP=sign z then zerosign signP 0
else if mode = To_ninfinity then -0
else 0
) else ( ― ‹Not exactly zero: Rounding has sign of exact value, even if rounded val is zero›
zerosign (if r<0 then 1 else 0) (round mode r)
)
)"
where zerosign
is defined:
definition zerosign :: "nat ⇒ ('e ,'f) float ⇒ ('e ,'f) float"
where "zerosign s a =
(if is_zero a then (if s = 0 then 0 else - 0) else a)"
HOL4 would write POS0
or NEG0
inside the nested if
.
The new Isabelle code looks right to me.
Do you want to have a try at fixing this bug? I can test it if someone else proposes a fix.
I’ll give it a go.
So, this bug is HOL4's floating point is not very likely to matter for real programs, but here is a moderately realistic program (in Standard ML) where it matters:
fun dot_product (sum, _, []) = sum | dot_product(sum, [], _) = sum | dot_product (sum, x::xs, y::ys) = dot_product(Real.*+(x, y, sum), xs, ys);
A theorem you might want to prove is that if dot_product(~0.0, a, b) = ~0.0, then the dot product (computed to arbitrary precision, without rounding) is negative. (If the dot_product is positive 0.0, then nothing can be said).
I think this is what the feature in IEEE 754 is for ... if you have an underflow in computing the dot product, you don't get to know its value even approximately, but you can know that it was negative.
N.B: There's no need to waste time actually trying to prove this theorem; this is just by way of justifying why that case is there in the IEEE 754 specification.
P.S. In CakeML, it looks something like this:
fun dot_product sum a b = case a of [] => sum | x::xs => (case b of [] => sum | y::ys => dot_product (Double.fma(x y sum)) xs ys);
I've made what I think is the appropriate update to the definition of float_mul_add
.
Is there an analogous update that should be made to float_mul_sub
?
I've included your test-case from above, but if you have any more in a similar vein, I'd be happy to include them.
I have just run the fused-multiply-add tests from IBM's FPgen against your revised float_mul_add, and all the tests now pass. So I think we have fixed the bug in float_mul_add.
Yes, we probably should make the analogous update to float_mul_sub too.