go
go copied to clipboard
math: add Compare and Compare32
The slices.Sort doesn't support sorting slices of floating-point numbers containing Not-a-number (NaN) values, It's not very intuitive.
Once the slices package is merged into stdlib, the sort.{TYPE}s(e.g. sort.Ints, sort.Float64s) APIs may be semi-deprecated. But the slices package doesn't provide a simple way to sort slices of floats containing NaNs, it recommends using slices.SortFunc(x, func(a, b float64) bool {return a < b || (math.IsNaN(a) && !math.IsNaN(b))}), this function is very complex compared to using sort.Float64s.
Related CL: https://go-review.googlesource.com/c/exp/+/446155
Change https://go.dev/cl/446155 mentions this issue: slices: Sort support sorting slices of floating-point numbers containing NaN values
This is related to #33440. @smasher164 mentions the existence of a total-ordering predicate that could be used to sort floats with NaNs.
CC @eliben @randall77
Once the slices package is merged into stdlib, the sort.{TYPE}s(e.g. sort.Ints, sort.Float64s) APIs may be semi-deprecated.
I don't think they will, using sort.Float64s will be fine forever.
I think the only advantage of slices.Sort is that it will work on slices of named float64s (e.g. type F float64; a := []F{5,4,3}; slices.Sort(a), that won't work using sort.Float64s). And in that rare case one can use the described workaround.
Since Go1.18 is released, many users use the slices.Sort to replace sort.{TYPE}s, because the former is faster and more clear in almost all cases. From the point of view of the users, the former looks more like a replacement of the latter, they have almost the same function in fact except the slices.Sort doesn't support sorting slices containing NaNs.
The main concern is that users don't notice this special case, and the replacement may break their codes.
I think it's worth adding documentation to slices.Sort that sorting a float won't work. Maybe also there could be a comparator function in sort.
I think it's worth adding documentation to slices.Sort that sorting a float won't work. Maybe also there could be a comparator function in sort.
The documentation for Sort is currently:
Sort sorts a slice of any ordered type in ascending order. Sort may fail to sort correctly when sorting slices of floating-point numbers containing Not-a-number (NaN) values. Use slices.SortFunc(x, func(a, b float64) bool {return a < b || (math.IsNaN(a) && !math.IsNaN(b))}) instead if the input may contain NaNs.
Is this not sufficient documentation, @carlmjohnson ?
@zhangyunhao116 I'm not a big fan of this proposal at this time. The problem with sorting floats is clearly documented and a workaround is provided. It's easy to sort using SortFunc
That seems good to me.
In #50340 I think there was talk about adding sort.Compare which would take a comparable and return -1, 0, 1. If something like that is ever added there should also be CompareFloat, but it's not necessary on its own.
The problem with sorting floats is clearly documented and a workaround is provided. It's easy to sort using
SortFunc
Agree that the special case is clearly documented and has a workaround, but it's nice to support all the special cases as sort.{TYPE}s does :)
I did a simple poll about slices.Sort at my work. The question is
The
slices.Sortsorts a slice of any ordered type in ascending order. Intuitively, you think the function isA. A replacement of
sort.{TYPE}sAPIs, has the same output, and support all special cases. B.slices.Sortandsort.{TYPE}sAPIs exist side by side, and they may have different outputs in special cases.
51 people joined it, 82.4% (42) selected A, and 17.6% (9) selected B. Looks like many users of this API doesn't notice the special case.
The proposal still need more discussions, it's a rare case, and It's merely not as users intuitively expected. The slices.Sort work very well for all the general cases.
Special casing the generic type being float64 would solve sorting []float64, but it wouldn't solve sorting []struct {x float64}.
I don't see how we can special case this reasonably. Callers should not sort slices with NaNs in them.
What if instead, we provided a function in the math package like:
package math
// FloatCompare compares x to y according to the
// total-ordering predicate in IEEE-754, section 5.10.
// The result will be 0 if a == b, -1 if a < b, and +1 if a > b.
func FloatCompare[F constraints.Float](a, b F) int
// FloatLess reports whether a is less than b according to the
// total-ordering predicate in IEEE-754, section 5.10.
func FloatLess[F constraints.Float](a, b F) bool {
return FloatCompare(a, b) < 0
}
Thus, usage with the slices package can be something like:
var fs []MyFloat32
slices.SortFunc(fs, math.FloatLess[MyFloat32])
This would solve both the NaN issue from this issue and the -0 vs +0 issue from #33440.
More complicated composite cases like https://github.com/golang/go/issues/56491#issuecomment-1309218085 can be handled by passing a user-defined less function to slices.SortFunc that's implemented in terms of math.FloatLess.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
What if instead, we provided a function in the
mathpackage like:
Should FloatCompare go in math or sort? I can see the case for either.
ISTM, if we have that we should have Compare[Int ~int|~uint|etc](a, b N) int in the same package too.
My argument for it being in "math" is:
-
The other scalar kinds (int, uint) don't really have ambiguous definitions of ordering (i.e., the
<operator does exactly what you want). It's only really float that is special. -
The "math" package seems to already have special handling for IEEE-754. I proposed the names with a
Floatprefix to match the existingFloatXXbitsandFloatXXfrombitsfunctions that are IEEE-754 specific. There are other IEEE-754 specific functions likeIsNaNandNaN. -
I don't think a predicate for ordering is sufficient justification for it being "sort". For example, should we have moved
bytes.Compareto besort.BytesCompare, ornetip.Addr.Compareto besort.AddrCompare, ortime.Time.Compareto besort.TimeCompare?
Functionality for order should live with other functionality for the type (i.e., "math" for floats, "bytes" for []byte, "strings" for strings, "netip" for
netip.Addr, "time" fortime.Time, etc.).
That makes sense to me. My only other thought is that math.Compare(a, b float64) int would also make sense since all the functions in math take float64.
What if instead, we provided a function in the
mathpackage like:
This total-ordering predicate would be pretty simple to define as well, since it's just:
func sign(a, b int64) int {
if a < b {
return -1
}
if a > b {
return 1
}
return 0
}
func FloatCompare[F constraints.Float](a, b F) int {
x := int64(Float64bits(float64(a)))
y := int64(Float64bits(float64(b)))
x ^= int64(uint64(x>>63) >> 1)
y ^= int64(uint64(y>>63) >> 1)
return sign(x, y)
}
The only thing I'd point out is this ordering of NaNs is different from the way sort.Float64s does it. The total ordering predicate sorts -NaN < everything else < +NaN, while sort.Float64s does (+-)NaN < everything.
The special case in sort seems like a clear decline since it can't be done generally. Retitling to be about this math.FloatCompare that might be helpful for people writing their own.
I feel slightly better about this because it is defined by IEEE-754 and we are not making it up. That said, it still seems very niche. The definition is in https://irem.univ-reunion.fr/IMG/pdf/ieee-754-2008.pdf on page 40.
I don't think we should make this function the first generic one in math. It should probably just be func Compare(x, y float64) int.
My reason for proposing a generic version was because you can't roundtrip convert a float32->float64->float32 and preserve the exact bits for NaNs. A single bit is mangled in the float32->float64 conversion.
See this example: https://go.dev/play/p/sPn-yfpd-nK, where one of the bits gets mangled on my Ryzen 5900x. I don't know if this a Go-specific issue or a x86 issue.
This means that:
var x, y float32 = ...
Compare(float64(x), float64(y))
wouldn't be quite correct for NaNs. Arguably this case is esoteric, but Compare is already esoteric, so the people who use it probably care about correctness in the esoteric cases.
Options:
- Accept this oddity and just provide the float64-only API.
- Fix the float32->float64 conversion issue in the compiler (if possible?)
- Provide both
Compare32andCompare64or something.
I don't have a strong preference.
My feeling is to just provide the float64 API under the name math.Compare, and wait until math/v2 to provide a generic version along with generic everything else.
See this example: https://go.dev/play/p/sPn-yfpd-nK, where one of the bits gets mangled on my Ryzen 5900x. I don't know if this a Go-specific issue or a x86 issue.
This is squashing a signaling NaN to a quiet NaN. I believe all platforms do this on a float32->float64->float32 conversion. (Joe and I have been through this ugliness before, e.g., #27193 and #36400.)
I'd be ok with a Compare and Compare32. (Following the lead of Nextafter and Nextafter32.)
Writing this function generically without converting to float64 first would require the ability to either call Float32bits or Float64bits depending on the type of the argument, ~~which doesn't seem possible without converting to interface{} and type-asserting. I think either Compare32/Compare64 or Compare(x, y float64) is fine.~~
Edit: I forgot that you can use unsafe and get the float value's size.
func sign[I constraints.Signed](a, b I) int {
if a < b {
return -1
}
if a > b {
return 1
}
return 0
}
func FloatCompare[F constraints.Float](a, b F) int {
if unsafe.Sizeof(a) == 4 {
x := int32(math.Float32bits(float32(a)))
y := int32(math.Float32bits(float32(b)))
x ^= int32(uint32(x>>31) >> 1)
y ^= int32(uint32(y>>31) >> 1)
return sign(x, y)
} else {
x := int64(math.Float64bits(float64(a)))
y := int64(math.Float64bits(float64(b)))
x ^= int64(uint64(x>>63) >> 1)
y ^= int64(uint64(y>>63) >> 1)
return sign(x, y)
}
}
Compare and Compare32 sounds fine. It looks like maybe float32->float64 does preserve signaling and float64->float32 is what squashes it, in which case plain Compare would technically be OK. But maybe other chips are different.
It sounds like the API is:
func Compare(x, y float64) int func Compare32(x, y float32) int
and both functions return -1 when x compares before y, 0 when they compare the same, and +1 when x compares after y. The comparison is from PDF page 40 of https://irem.univ-reunion.fr/IMG/pdf/ieee-754-2008.pdf which puts negative NaNs before negative numbers and positive NaNs after positive numbers.
It looks like that comparison is exactly a bitwise comparison like int64(math.Float64bits(x)) compared with int64(math.Float64bits(y))
Can someone confirm that?
@rsc
int64(math.Float64bits(x)) compared with int64(math.Float64bits(y))
It's not exactly a bitwise comparison. When x and y are both negative,
cmp(int64(math.Float64bits(x)), int64(math.Float64bits(y)))
gives you an incorrect answer. For example, int64(math.Float64bits(-3)) < int64(math.Float64bits(-2)) is false.
What's needed is a check for the sign bits of x and y. If they're both negative, flip the comparison.
So something like
x := int64(math.Float64bits(float64(a)))
y := int64(math.Float64bits(float64(b)))
if math.Signbit(a) && math.Signbit(b) {
return sign(cmp(y, x))
}
return sign(cmp(x, y))
This can be done branchlessly as well of course, by toggling all the bits based on the sign bit.
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group
Change https://go.dev/cl/459435 mentions this issue: math: add Compare and Compare32