Fable icon indicating copy to clipboard operation
Fable copied to clipboard

floating ranges affected by precision loss

Open daz10000 opened this issue 2 years ago • 10 comments

EDIT: See this for the cause of the problem, and this for a workaround.

Description

The range [ 0.5..0.05..1.5 ] should include 1.5 and that value is omitted in python

Repro code

This F# code

for x in [ 0.5..0.05..1.5 ] do
    printfn $"{x}"

Expected and actual results

dotnet output

0.5
0.55
0.6
0.65
0.7
0.75
0.8
0.8500000000000001
0.9
0.95
1
1.05
1.1
1.15
1.2000000000000002
1.25
1.3
1.35
1.4
1.4500000000000002
1.5

Python output (note 1.5 omitted)

$ python program.py
0.5
0.55
0.6000000000000001
0.6500000000000001
0.7000000000000002
0.7500000000000002
0.8000000000000003
0.8500000000000003
0.9000000000000004
0.9500000000000004
1.0000000000000004
1.0500000000000005
1.1000000000000005
1.1500000000000006
1.2000000000000006
1.2500000000000007
1.3000000000000007
1.3500000000000008
1.4000000000000008
1.4500000000000008

The generated python uses the range_double(0.5, 0.05, 1.5) function. Python ranges typically omit the final value. Not sure if that's an issue generally for ranges or specific to floating point numbers..

Related information

$ dotnet fable --version
4.0.0-snake-island-alpha-010

Windows 11, git bash.

daz10000 avatar Jun 30 '22 22:06 daz10000

For Range Python reuses the same implementation as JS in fable-library/Range.fs and I just realized this problems happens in JS too so I need to fix it 😅 I'm assuming this is because the last item goes 1.5 (probably 1.5000000002) and Fable just discards it instead of truncating to 1.5. This is the implementation: https://github.com/fable-compiler/Fable/blob/e54f2b12385d16c9eb8dbfeb47369071d709ecaf/src/fable-library/Range.fs#L3-L13

BTW, this is not exactly related but when checking I realized the generated local function declarations for the lambdas seem to have too many arguments. For example, this:

let integralRangeStep<'T when 'T: comparison> (start: 'T) (step: 'T) (stop: 'T) (zero:'T) (add: 'T -> 'T -> 'T) =
    let stepFn = makeRangeStepFunction step stop zero add
    Seq.delay(fun () -> Seq.unfold stepFn start)

Becomes:

def integral_range_step(start: _T, step: _T, stop: _T, zero: _T, add: Any) -> IEnumerable[_T]:
    step_fn : Callable[[_T], Optional[Tuple[_T, _T]]] = make_range_step_function(step, stop, zero, add)
    def arrow_117(start: _T=start, step: _T=step, stop: _T=stop, zero: _T=zero, add: Any=add) -> IEnumerable[_T]:
        return unfold(step_fn, start)
    return delay(arrow_117)

arrow_117 has the same arguments as the enclosing methods, but the lambda passed to Seq.delay in F# has no arguments (actually, a single unit argument). Should it be like this @dbrattli?

def integral_range_step(start: _T, step: _T, stop: _T, zero: _T, add: Any) -> IEnumerable[_T]:
    step_fn : Callable[[_T], Optional[Tuple[_T, _T]]] = make_range_step_function(step, stop, zero, add)
    def arrow_117() -> IEnumerable[_T]:
        return unfold(step_fn, start)
    return delay(arrow_117)

alfonsogarciacaro avatar Jul 01 '22 04:07 alfonsogarciacaro

Unfortunately this is trickier than I expected. Actually ranges in F# don't include the last value necessarily:

> [0..2..9];;       
val it: int list = [0; 2; 4; 6; 8]

The problem here is the way JS (and Python) add floats which loses precision on the way. Not sure what's the best way to fix this :/

alfonsogarciacaro avatar Jul 01 '22 07:07 alfonsogarciacaro

@daz10000 Not the ideal solution, but just to let you know, using decimal which has arbitrary precision just works:

for x in [ 0.5M..0.05M..1.5M ] do
    printfn $"{x}"

// Python output
0.5
0.55
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95
1.00
1.05
1.10
1.15
1.20
1.25
1.30
1.35
1.40
1.45
1.50

alfonsogarciacaro avatar Jul 04 '22 00:07 alfonsogarciacaro

@alfonsogarciacaro Would division to get the int number of steps, then integer range for the number of steps and multiplication for each step work better to avoid accumulation of errors?

ncave avatar Jul 04 '22 02:07 ncave

@ncave That's a great idea! You mean something like this?

let rangeDouble (start: float) (step: float) (stop: float): float seq =
    if step = 0. then
        failwith "The step of a range cannot be zero"
    else
        let stepsNumber = (stop - start) / step
        if stepsNumber < 0 then [||]
        elif stepsNumber = 0 then [|start|]
        else
            Seq.delay(fun () ->
                0. |> Seq.unfold (fun i ->
                    if i > stepsNumber then None
                    else Some(start + (step * i), i + 1.)
                ))

Seems to work for the example above (at least in js), but now I'm puzzled by other samples which doesn't seem to be consistent either in dotnet or js :/

> [5.0.. 0.2 ..5.6];;

// dotnet: note last value is not included
val it: float list = [5.0; 5.2; 5.4]

// JS
5
5.2
5.4


> [5.0.. 0.1 .. 5.6];;

// dotnet: last value is included
val it: float list = [5.0; 5.1; 5.2; 5.3; 5.4; 5.5; 5.6]

// JS: last values is NOT included
5
5.1
5.2
5.3
5.4
5.5

alfonsogarciacaro avatar Jul 04 '22 02:07 alfonsogarciacaro

@alfonsogarciacaro Perhaps you need to get ceiling or round on the number of steps after you divide? Idk, I haven't looked at the dotnet implementation. Floating point representations are the same, so we should be able to make it work the same.

ncave avatar Jul 04 '22 03:07 ncave

Yes, probably at the end we'll have to port the code from FSharp.Core. Not entirely sure where the code we're using currently from fable-library comes from (did you write or did we take it from FunScript?) though I like that it's very concise so it adds very little to JS bundles.

alfonsogarciacaro avatar Jul 04 '22 03:07 alfonsogarciacaro

@alfonsogarciacaro Looks like the salient part is here:

  • do a division to get number of steps (as you do above)
  • if number of steps is a 32-bit integer (i.e. in int32 range and floor(steps) = steps), then do start + (step * i) (as you do above)
  • or else, do a float range (needs to take care of NaNs, +/- infinity, and avoid an endless loop by stopping when the next step addition results in the same number due to precision loss).

Something like this (seems to works on all tests above):

//TODO: handle NaNs, infinity
//TODO: handle negative steps

let start = 5.0;
let stop = 5.6;
let step = 0.1;

let steps = (stop - start) / step;

let minInt = -2147483648.0
let maxInt =  2147483647.0
let isInt32 = x => minInt <= x && x <= maxInt && Math.floor(x) === x;

console.log(`isInt32(steps) = ${isInt32(steps)}`);

if (isInt32(steps)) {
  for (let i = 0; i <= steps; i++) {
    console.log(start + (step * i));
  }
}
else {
  for (let i = 0, d = start; d <= stop; i++, d += step) {
    console.log(start + (step * i));
    if (d + step === d) break;
  }
}

ncave avatar Jul 04 '22 13:07 ncave

Thanks for digging into this. I hadn’t appreciated the complexity hidden in that subtle range request. Counting is definitely the right approach. You’ll never tame epsilon problems with floating point addition. There will always be some small “negligible” error.

https://www.monkeyuser.com/2021/negligible-error

daz10000 avatar Jul 04 '22 16:07 daz10000

@daz10000 Technically there are ways to increase precision through compensated summation.

//TODO: handle NaNs, infinity
//TODO: handle negative steps

let start = 0.5;
let stop = 1.5;
let step = 0.05;

for (let i = 0, c = 0, sum = start; sum <= stop; i++) {
  //console.log(sum); // this value is even more precise than .NET output
  console.log(start + (step * i)); // use this value to match .NET output
  
  // compensated summation
  let y = step - c;  // c is zero the first time around.
  let t = sum + y;   // Alas, sum is big, y small, so low-order digits of y are lost.
  c = (t - sum) - y; // (t - sum) cancels the high-order part of y; subtracting y recovers negative (low part of y)
  
  if (sum === t) break; // just in case, to prevent endless loops
  sum = t;           // Algebraically, c should always be zero. Beware overly-aggressive optimizing compilers!
}

Unfortunately this makes it almost too precise now ;) and doesn't match the F# .NET output for, say, [5.0 .. 0.2 .. 5.6] (.NET doesn't give you the last element 5.6 and is arguably incorrect in this case, while this more precise version does).

So it is an open question whether we want to match exactly the (arguably less correct) .NET behavior (see above) or not.

P.S. Just to be clear, while this more precise summation makes the error much smaller in general, it's still there if you pick the right values (e.g. [1.0 .. 0.1 .. 1.4] doesn't give you the last element 1.4, and the F# .NET output also does not have it).

ncave avatar Jul 04 '22 19:07 ncave