
Text.Megaparsec.Byte.Lexer.decimal returns wrong value for Float

Open jchia opened this issue 2 years ago • 2 comments

ghci> parse (decimal @() @ByteString @_ @Float) "" "655361200000"
Right 6.5536125e11
ghci> (== 655361250000) <$> parse (decimal @() @ByteString @_ @Float) "" "655361200000"
Right True
ghci> (== 655361200000) <$> parse (decimal @() @ByteString @_ @Float) "" "655361200000"
Right False
ghci> (subtract $ fromIntegral @_ @Float 655361200000) <$> parse (decimal @() @ByteString @_ @Float) "" "655361200000"
Right 65536.0

The Float value produced is not equal to what fromIntegral @_ @Float 655361200000 returns.

It appears that decimal accumulates floating-point error during parsing, so the final Float result is wrong. (Given the limited precision of Float, some error is expected, but the result is further from the exact value than necessary.) The error probably accumulates from the lossy arithmetic performed for each input character, including the (* 10).
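To illustrate the suspected mechanism without involving the parser, here is a minimal sketch. It assumes decimal folds over the digits with an `acc * 10 + digit` step in the target type; `accumulate` is a hypothetical stand-in, not megaparsec's actual code.

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')

-- Hypothetical stand-in for the per-digit accumulation: each step
-- multiplies by 10 and adds the digit, all in the result type itself.
accumulate :: Num a => String -> a
accumulate = foldl' step 0
  where
    step acc c = acc * 10 + fromIntegral (digitToInt c)

main :: IO ()
main = do
  let input  = "655361200000"
      -- Every intermediate value is rounded to Float once it no
      -- longer fits in the 24-bit significand.
      folded = accumulate input :: Float
      -- Exact Integer value, rounded to Float a single time.
      direct = fromIntegral (655361200000 :: Integer) :: Float
  print folded
  print direct
  print (folded - direct)
```

With IEEE single-precision Float, the intermediate values 655361200 and 6553612160 are both rounded up, so the folded result ends up one unit in the last place (65536 at this magnitude) above the directly converted value, which matches the 65536.0 difference shown above.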

Double is likely affected in the same way, but I have not constructed a counter-example; it would probably require more digits than the Float counter-example.

jchia, May 11 '22 03:05

Yeah, this is not ideal. IIRC I mostly copied the implementation from attoparsec. I'm going to add a warning to the docs for now.

mrkkrp, May 29 '22 13:05

Wouldn't it help if decimal first parsed into a non-lossy type such as Integer and then used fromInteger? The float parser uses Scientific under the hood for the same reason.
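A rough sketch of that idea as a user-side wrapper, assuming the current decimal is exact when instantiated at Integer. The name `decimalExact` and the `Parser` alias are hypothetical and not part of megaparsec.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseTest)
import qualified Text.Megaparsec.Byte.Lexer as L

type Parser = Parsec Void ByteString

-- Hypothetical wrapper: run the existing decimal parser at Integer,
-- which cannot lose precision, then convert to the requested type
-- with a single fromInteger at the end.
decimalExact :: Num a => Parser a
decimalExact = fromInteger <$> L.decimal

main :: IO ()
main = do
  -- Rounds once, from the exact Integer value.
  parseTest (decimalExact :: Parser Float) "655361200000"
  -- Current behaviour, for comparison.
  parseTest (L.decimal :: Parser Float) "655361200000"
```

The trade-off is that every call goes through an arbitrary-precision Integer, which may cost some speed for fixed-width integral result types where the direct fold is already exact barring overflow.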

olafklinke avatar Jun 21 '23 20:06 olafklinke