feature: don't round float values when decoding
Is your feature request related to a problem? Please describe.
I'm developing a library related to CBOR. I haven't run into an issue with this yet, but when I noticed that float values can be silently truncated this surprised me. I'm interested in the rationale for this, and whether you would be open to changing this behaviour.
For example, when decoding the float64 value 1.50000000000000022204 (fb3ff8000000000001 in CBOR hex) into a float32, this library rounds the value to the closest float32 value, 1.5.
My main concern with this is that there is silent data loss.
Describe the solution you'd like
The overflowFloat32 function is changed to check that an exact value can be transferred, not just min/max checking. This could be behind a decoding option, but I think mandating it or making it the default is worth considering as well. Let me know what you think.
Describe alternatives you've considered
One alternative is to just always use float64 when decoding. In practice I think this is what most users will do, and so this isn't a major issue.
From https://github.com/hyphacoop/go-dasl/issues/7:
@makew0rld Can you write a round-trip test for this and put it on https://go.dev/play/ so I can take a look while AFK?
@x448 sure, although to be clear it's not round-trip. The situation:
Bob marshals a float64 into CBOR with value 1.50000000000000022204 Alice decodes it into a float32 Now Alice has value 1.5, losing data without realizing it
Here's a code sample: https://go.dev/play/p/Iz-mFrMZv3y
@makew0rld here is a sample without using fxamacker/cbor: https://go.dev/play/p/Lgt1_tBdWRW
Without using fxamacker/cbor, the floating point value still loses some precision:
package main
import (
"fmt"
)
func main() {
var f64 float64 = 1.50000000000000022204
var f32 float32 = 1.50000000000000022204
fmt.Printf("float64: %v\n", f64)
fmt.Printf("float32: %v\n", f32)
}
Output is:
float64: 1.5000000000000002
float32: 1.5
Floating point is silently lossy without this package.
@fxamacker what do you think about adding an opt-in decoding option to make it check? Would decoding be slower for everyone whether or not the option is used?
Without using fxamacker/cbor, the floating point value still loses some precision
Just to clarify, the float64 in this example is not losing precision. fmt is just printing out a condensed form.
float32 is losing precision, as Go is taking an arbitrary precision decimal and casting it. My point is just that this library doing that while decoding is perhaps surprising. Open to discussion on this, maybe it wouldn't surprise users. For reference, encoding/json is even worse.
@makew0rld Thanks for opening this issue! I think you raised some valid points.
When decoding CBOR float values into float32, the codec uses functions in Go's reflect package which does the rounding.
This library and Go's built-in encoding/json show the same behavior, but this rounding behavior can still be surprising so I will document it.
For reference, encoding/json is even worse.
BTW, the following program outputs the same for encoding/json and this library:
float32: 1.50000000000000000000, <nil>
float64: 1.50000000000000022204, <nil>
float32: 1.50000000000000000000, <nil>
float64: 1.50000000000000022204, <nil>
https://go.dev/play/p/EJsAVVzQN8v
package main
import (
"encoding/hex"
"encoding/json"
"fmt"
"github.com/fxamacker/cbor/v2"
)
func js() {
var f float32
err := json.Unmarshal([]byte(`1.50000000000000022204`), &f)
fmt.Printf("float32: %1.20f, %v\n", f, err)
var f2 float64
err = json.Unmarshal([]byte(`1.50000000000000022204`), &f2)
fmt.Printf("float64: %1.20f, %v\n", f2, err)
}
func cb() {
b, _ := hex.DecodeString("fb3ff8000000000001") // 1.50000000000000022204
var f float32
err := cbor.Unmarshal(b, &f)
fmt.Printf("float32: %1.20f, %v\n", f, err)
var f2 float64
err = cbor.Unmarshal(b, &f2)
fmt.Printf("float64: %1.20f, %v\n", f2, err)
}
func main() {
js()
fmt.Println()
cb()
}
The overflowFloat32 function is changed to check that an exact value can be transferred, not just min/max checking.
The general approach sounds good to me, although I can't change overflowFloat32 function because it is in the reflect package,
This could be behind a decoding option, but I think mandating it or making it the default is worth considering as well. Let me know what you think.
Since this isn't a security bug in this library, we cannot introduce breaking changes. Given this, we can support this check as an opt-in decoding option.
I can add this feature to the next milestone and open a PR to resolve this when time allows.
Thanks again for opening this issue.
Thanks for the corrected program. Adding this feature would be appreciated. It's possible I will also get around to a PR for this, I'll update this issue if so.
I know this is not in an acceptable state, but for posterity here's how I achieved this in my fork, it's quite simple:
@@ -1651,7 +1658,16 @@ func (d *decoder) parseToValue(v reflect.Value, tInfo *typeInfo) error { //nolin
case additionalInformationAsFloat64:
f := math.Float64frombits(val)
- return fillFloat(t, f, v)
+ err := fillFloat(t, f, v)
+ if !d.dm.keepFloatPrecision || err != nil {
+ return err
+ }
+ // No error and we need to maintain float precision
+ if f == float64(float32(f)) {
+ return nil
+ }
+ return &UnmarshalTypeError{CBORType: t.String(), GoType: v.Type().String(),
+ errorMsg: "float64 value would lose precision in float32 type"}
default: // ai <= 24
if d.dm.simpleValues.rejected[SimpleValue(val)] {
I know this is not in an acceptable state, but for posterity here's how I achieved this in my fork, it's quite simple:
@@ -1651,7 +1658,16 @@ func (d *decoder) parseToValue(v reflect.Value, tInfo *typeInfo) error { //nolin case additionalInformationAsFloat64: f := math.Float64frombits(val) - return fillFloat(t, f, v) + err := fillFloat(t, f, v) + if !d.dm.keepFloatPrecision || err != nil { + return err + } + // No error and we need to maintain float precision + if f == float64(float32(f)) { + return nil + } + return &UnmarshalTypeError{CBORType: t.String(), GoType: v.Type().String(), + errorMsg: "float64 value would lose precision in float32 type"} default: // ai <= 24 if d.dm.simpleValues.rejected[SimpleValue(val)] {
Hi @makew0rld, please feel free to open a PR for this if you are OK with contribution guidelines. E.g., pull requests have signing requirements and must not be anonymous. Exceptions are usually made for docs and CI scripts.
Created #701 to address this.