codeql-cli-binaries icon indicating copy to clipboard operation
codeql-cli-binaries copied to clipboard

float.toString() result is lossy

Open Marcono1234 opened this issue 4 years ago • 6 comments

It appears the built-in predicate float.toString() causes precision loss for the result. This can be irritating and also prevents using it in QL InlineExpectationsTest because there the expected value has to be a string.

CodeQL's float uses 64-bit and is therefore able to represent for example 1.555555555; yet 1.555555555.toString() has 1.555556 as result. These are not the same values, 1.555555555 = 1.555556 does not hold.

Marcono1234 avatar Mar 18 '21 22:03 Marcono1234

Hi @Marcono1234,

Thanks for the report! I've forwarded this issue to the CoreQL team :)

MathiasVP avatar Mar 19 '21 12:03 MathiasVP

Looks like float.toString() uses Float.toString() or casting (while it should probably use Double.toString() instead):

jshell> Double.parseDouble("1.555555555")
$1 ==> 1.555555555

jshell> (float) $1
$2 ==> 1.5555556

intrigus-lgtm avatar Mar 19 '21 16:03 intrigus-lgtm

No, it's weirder than that. The implementation of toString() uses Java double throughout, but feeds those doubles to a DecimalFormat object that's explicitly configured to print only 6 decimal places. Nobody quite seems to remember what the purpose of that is (it was changed from Double.toString() in May 2011, says git), so we're taking some time to see if we can think of something important that would break if we fix it to preserve full precision.

hmakholm avatar Mar 20 '21 20:03 hmakholm

Thanks intrigus-lgtm for your check and for hmakholm the insight.

I could imagine that it was maybe done to emit output which is a valid CodeQL float literal. Double.toString() uses the scientific e notation, which is not supported by CodeQL. Though for non-finite numbers it will nonetheless have output which cannot directly be written as CodeQL float literal (e.g. "NaN".toFloat().toString()). Additionally the current CodeQL float.toString() result for large numbers is also pretty difficult to read, e.g.:

"1e100".toFloat().toString()
// Result: 10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104

So trying to emit something which is a valid CodeQL float literal, in case that was the intention, has its disadvantages.

Marcono1234 avatar Mar 20 '21 22:03 Marcono1234

This issue is stale because it has been open 14 days with no activity. Comment or remove the stale label in order to avoid having this issue closed in 7 days.

github-actions[bot] avatar Apr 14 '21 15:04 github-actions[bot]

Moving to codeql-cli-binaries so we have a place to track this arguable deficiency.

Once we have design clarity, the actual code fix ought to be simple.

hmakholm avatar Apr 14 '21 17:04 hmakholm