evaluator: support mode/flag in which dropping down to float precision should result in an error
Currently, the CUE spec explicitly allows implementations to drop down from arbitrary precision to float precision. This is necessary, for example, for various functions in math.
However, in financial situations dropping down to float precision is more than likely an error.
Therefore, we might like to consider adding a mode/flag where dropping down to float precision is flagged as an error, unless there is an explicit cast/equivalent to float (see #253).
Adding this issue as a placeholder for discussion about what such a mode/flag might look like. Motivating examples to follow.
Noting #253 for context on proposed changes to number, float and int.
cc @disintegrator.
Adding some further context to this point following an exchange with @b4nst.
Consider: https://cuelang.org/play/?id=aZFUanhM79y#w=function&i=cue&f=eval&o=cue
x: ((1/3)*2*15*1*8000/0.65)*3
x: 30*1*8000/0.65
With v0.0.0-20240503105822-dff77a6d2a5a this gives:
x: conflicting values 369230.7692307692307692307692307693 and 369230.7692307692307692307692307692:
./x.cue:1:4
./x.cue:2:4
i.e. we are losing precision in the evaluation of the first expression for x (the 1/3 term is rounded to a fixed number of digits), which leads to a result that differs from the second expression in the final decimal place.
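My understanding is that the evaluator performs decimal arithmetic at a fixed precision (34 significant digits, via github.com/cockroachdb/apd, I believe), so any intermediate value that is not exactly representable, such as 1/3, is rounded and that error is then carried through the rest of the calculation. A rough Go sketch of the effect, purely illustrative and not guaranteed to match CUE's exact context settings digit-for-digit:

package main

import (
	"fmt"

	"github.com/cockroachdb/apd/v3"
)

func main() {
	// Work at 34 significant decimal digits; that this matches the CUE
	// evaluator's context is an assumption on my part.
	ctx := apd.BaseContext.WithPrecision(34)

	one := apd.New(1, 0)
	three := apd.New(3, 0)

	// third = 1/3, which cannot be represented exactly and is rounded.
	third := new(apd.Decimal)
	ctx.Quo(third, one, three)

	// back = third*3 at the same precision: the rounding of the
	// intermediate result is never recovered.
	back := new(apd.Decimal)
	ctx.Mul(back, third, three)

	fmt.Println(third) // 0.333...3 (34 threes)
	fmt.Println(back)  // 0.999...9 (34 nines), not 1
}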
Consider the following Go program using math/big that doesn't lose precision along the way:
package main

import (
	"fmt"
	"math/big"
)

func main() {
	// 30*1*8000/0.65 & ((1/3)*2*15*1*8000/0.65)*3
	v30 := big.NewRat(30, 1)
	v1 := big.NewRat(1, 1)
	v8000 := big.NewRat(8000, 1)
	v0_65 := big.NewRat(13, 20) // 0.65 as an exact rational
	v1_3 := big.NewRat(1, 3)
	v2 := big.NewRat(2, 1)
	v15 := big.NewRat(15, 1)
	v3 := big.NewRat(3, 1)

	// res1 = 30*1*8000/0.65, evaluated step by step with exact rationals.
	res1 := v30
	res1.Mul(res1, v1)
	res1.Mul(res1, v8000)
	res1.Quo(res1, v0_65)

	// res2 = ((1/3)*2*15*1*8000/0.65)*3, also exact at every step.
	res2 := v1_3
	res2.Mul(res2, v2)
	res2.Mul(res2, v15)
	res2.Mul(res2, v1)
	res2.Mul(res2, v8000)
	res2.Quo(res2, v0_65)
	res2.Mul(res2, v3)

	fmt.Printf("res1: %v\n", res1)
	fmt.Printf("res2: %v\n", res2)
}
which gives:
res1: 4800000/13
res2: 4800000/13
Indeed there is no loss of precision during manifestation either, because the result is presented as a fraction.
In writing all the above, there isn't anything in particular that I am proposing, apart, perhaps, from noting that a math/big.Rat-based approach would be able to retain full precision unless it encountered an operation that forced it to move to float-based arithmetic (e.g. square root), or where presentation/manifestation of a value forces a decimal-based value as opposed to a fraction.
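To illustrate that last point with the example above: a math/big.Rat result can stay exact right up to the moment a decimal rendering is required, and the number of digits is chosen only then. FloatString is an existing math/big method; the 34 fractional digits below are an arbitrary choice for illustration:

package main

import (
	"fmt"
	"math/big"
)

func main() {
	// 30*1*8000/0.65 kept as an exact rational: dividing by 0.65 (== 13/20)
	// is the same as multiplying by 20/13.
	res := big.NewRat(30*8000*20, 13)
	fmt.Println(res) // 4800000/13

	// Only at "manifestation" time do we pick a decimal rendering, and we
	// decide how many fractional digits to emit at that point.
	fmt.Println(res.FloatString(34)) // 369230.769230769230..., to 34 decimal places
}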
This can also happen when referencing a field with a single concrete value, as in this example:
#TimeSpan: {
start: float & (end-duration)
end: float & (start+duration)
duration: float & (end-start)
start: < end & >=0
}
t1: #TimeSpan & {
start: 1/3
duration: 10.0
}
t1.start: conflicting values 0.33333333333333333333333333333333 and 0.3333333333333333333333333333333333:
-:2:21
-:9:5
-:10:12
The inner cycle in #TimeSpan is probably what causes it to be evaluated twice, but it's difficult for an inexperienced user to understand why it's happening.
Whatever the cause, this is a really good example to motivate the problem of dropping precision. Thanks @b4nst
I wonder whether, rather than using a flag, we could denote the precision requirement by assigning a new number type, e.g. decimal. I'm not familiar with the various implementations of lossless decimals, but including such a type with wide support for arithmetic operations (addition, multiplication, exponentiation at a minimum) would solve this more directly.
This approach would favour "intent in code" as you wouldn't ever model something in CUE with the intention of using the precision flag, and then turning it off. And if you would need to ever map values to a "lossy number" then you could do this through a type coercion (if added) or exporter, or language API.
This approach would favour "intent in code"
I'd tend to agree with that as a guiding approach, because it feels to me like float leaks too much of the implementation through. That said, I think that decimal "leaks" for the same reason.
Instead we could lean on the existing definitions from mathematics. As a strawman:
- real - rational and irrational numbers
- irrational - numbers that cannot be expressed as fractions
- rational - numbers that can be expressed as fractions
- int - integers
- whole - integers 0, 1, ...
We could then define convenience "types" (which are actually constraints) for commonly used ones like uint16.
My terminology might be off, but I believe this then lends itself to symbolic computation or exact arithmetic. We would retain "calculation" of a field's value right until the last minute, i.e. the time it needs to be manifest as a concrete value.
Or (given such an approach is more expensive and cannot take advantage of special CPU instructions) there could be an entirely different mode of the evaluator that uses "regular" arithmetic.
This is certainly not my area of expertise, but https://pkg.go.dev/math/big is related I believe.
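To make "retain the calculation until the last minute" slightly more concrete, here is a toy Go sketch. All of the names are invented for illustration and bear no relation to the actual evaluator; expressions are kept symbolic and only reduced, with exact math/big.Rat arithmetic, when a concrete value is demanded:

package main

import (
	"fmt"
	"math/big"
)

// expr is a toy symbolic expression: either a literal rational or a binary
// operation. Purely illustrative.
type expr interface {
	eval() *big.Rat // reduce to an exact rational, only when demanded
}

type lit struct{ v *big.Rat }

func (l lit) eval() *big.Rat { return l.v }

type binop struct {
	op   byte // '+', '-', '*', '/'
	x, y expr
}

func (b binop) eval() *big.Rat {
	x, y := b.x.eval(), b.y.eval()
	z := new(big.Rat)
	switch b.op {
	case '+':
		return z.Add(x, y)
	case '-':
		return z.Sub(x, y)
	case '*':
		return z.Mul(x, y)
	default:
		return z.Quo(x, y)
	}
}

func rat(a, b int64) expr { return lit{big.NewRat(a, b)} }

func main() {
	// ((1/3)*2*15*1*8000/0.65)*3, built as a symbolic tree; nothing is
	// rounded while the tree is being constructed or combined.
	e := binop{'*',
		binop{'/',
			binop{'*', binop{'*', binop{'*', binop{'*', rat(1, 3), rat(2, 1)}, rat(15, 1)}, rat(1, 1)}, rat(8000, 1)},
			rat(13, 20)}, // 0.65 == 13/20
		rat(3, 1)}

	// Only "manifestation" forces a value, and it is still exact.
	v := e.eval()
	fmt.Println(v)                 // 4800000/13
	fmt.Println(v.FloatString(10)) // 369230.7692307692
}

A real implementation would of course need a story for operations with no exact rational result (square root, etc.), which is exactly where an explicit drop to float, or an error in the proposed mode, would come in.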
Just to throw out one other thought I had in passing when encountering this issue:
IME this issue is most commonly seen when unifying numeric results when there's a cycle involved.
Rather than using big.Rat everywhere, which is far less efficient to compute with, I wonder if a viable approach might be to somehow allow unification of floats that are close together in value. For example, we could potentially allow two floats to be considered the "same" if they're within some error margin of one another. Or even allow that kind of approximate unification only when there's a cycle involved.
Checking for exact equality of floats is generally considered an anti-pattern in regular code: why should CUE be any different?
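Not arguing for or against it here, but to pin down what unifying "within some error margin" could mean mechanically, here is a Go sketch. Nothing in it is a proposal for actual CUE syntax or API; approxEqual and the chosen tolerance are invented for illustration:

package main

import (
	"fmt"
	"math/big"
)

// approxEqual reports whether |a-b| <= tol. The comparison itself is done
// with exact rationals, so it introduces no additional rounding error. A
// relative tolerance may well be more appropriate; this is the simplest
// possible version.
func approxEqual(a, b, tol *big.Rat) bool {
	d := new(big.Rat).Sub(a, b)
	return d.Abs(d).Cmp(tol) <= 0
}

func main() {
	// The two conflicting values from the #TimeSpan example above, which
	// differ only in how many digits of 1/3 were retained.
	a, _ := new(big.Rat).SetString("0.33333333333333333333333333333333")
	b, _ := new(big.Rat).SetString("0.3333333333333333333333333333333333")
	tol := big.NewRat(1, 1000000) // an arbitrary margin of 1e-6

	fmt.Println(approxEqual(a, b, tol)) // true
}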
Yet another motivating example, taken from https://github.com/tvandinther/units:
#Length: {
_millimeters: centimeters * 10
_millimeters: inches * 25.4
centimeters: _millimeters / 10
inches: _millimeters / 25.4
}
d: (#Length & {centimeters: 30.4}).inches
This fails with:
_millimeters: conflicting values 303.9999999999999999999999999999999 and 304.0:
./x.cue:4:16
./x.cue:5:16
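For what it's worth, the round trip in #Length is exact if the values are kept as rationals, because 25.4 is exactly the rational 127/5. A sketch using math/big (not a proposal for how CUE should implement this):

package main

import (
	"fmt"
	"math/big"
)

func main() {
	cm, _ := new(big.Rat).SetString("30.4")
	perInch, _ := new(big.Rat).SetString("25.4") // 127/5, exactly representable as a rational

	mm := new(big.Rat).Mul(cm, big.NewRat(10, 1)) // centimeters * 10
	inches := new(big.Rat).Quo(mm, perInch)       // _millimeters / 25.4
	back := new(big.Rat).Mul(inches, perInch)     // inches * 25.4

	fmt.Println(inches.RatString()) // 1520/127: no finite decimal representation, but exact
	fmt.Println(back.RatString())   // 304, identical to centimeters*10, so no conflict
}

The 303.999… value presumably appears because inches has to be rounded to a fixed number of decimal digits before being multiplied back by 25.4.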
Thanks @rogpeppe — that’s an interesting angle.
I can definitely see that allowing approximate unification of floats could be useful in pragmatic situations, particularly when resolving cycles — and it might well be something worth exploring as a targeted improvement.
But I don’t think this fully addresses the core issue I (and others) are concerned about — particularly for domains like finance, where exact control over precision is fundamental:
- In those contexts, it’s not acceptable to "approximate away" differences — we need to know precisely where precision is lost, and ideally retain full precision unless we make an explicit choice otherwise.
- If CUE were to silently unify based on an error margin, it would undermine the ability to reason soundly about financial computations, unit conversions, and similar domains.
One possible direction here — as I mentioned earlier — is to explore some form of symbolic computation, where expressions are preserved symbolically and only concretised when absolutely necessary (e.g. at manifestation or via an explicit conversion). This would allow:
- Full precision to be retained as far as possible,
- Loss of precision to be explicit and under user control,
- A more robust model for domains that rely on exact arithmetic.
Approximate float unification could still have a role — for example, as an opt-in strategy when dealing with cycles where exact precision isn’t required — but I’d be concerned if that became the default or the main way of handling these kinds of issues.
Specifically in the case of float, I think it could be nice to opt in to approximate unification through some kind of additional syntax, such as x: float(0.0001) or x: float & precision(0.0001) (whatever makes sense), which would functionally operate as a typical floating point comparison: if (abs(a - b) < 0.0001). In this way it can be opt-in. I agree that this shouldn't be made the default, but rather something users consider when dealing with floats. A floating point conflict which is detected to have a difference within an acceptable margin of error could also add a help message to the error suggesting the opt-in precision syntax.
I was thinking something similar when I reached for float(myint, 0.0001) and it didn't work to do a typecast.