go icon indicating copy to clipboard operation
go copied to clipboard

proposal: math/big: add Int.Float64 conversion (was initially: {ToInt64,ToUint64,Float64})

Open adonovan opened this issue 3 years ago • 12 comments

The Int type in the math/big package represents an arbitrary-precision integer. Today, it provides methods called Int64 and Uint64, which return the integer in the int64 and uint64 (machine) representations. However, it doesn't indicate whether the conversion is exact, so the caller must ascertain this, perhaps by a prior call to IsInt64 or IsUint64; otherwise, the result is undefined.

I propose to add three new methods to the package: Int.{ToInt64,ToUint64,Float64}. All three follow the same pattern of returning the closest representable value, and a big.Accuracy enum indicating whether the conversion was exact, a rounding up, or a rounding down. This matches the API of big.Float, which provides similar conversions to several types, including Int.

The To prefix is needed for ToInt64 and ToUint64 because the unprefixed names are taken.

Implementation sketch in https://go.dev/cl/453115. Tests to be added later, along with optimizations, such as a fast path of Float64 for 64-bit values.

@griesemer

adonovan avatar Nov 29 '22 19:11 adonovan

Change https://go.dev/cl/453115 mentions this issue: math/big: add Int.{ToInt64,ToUint64,Float64} conversions

gopherbot avatar Nov 29 '22 19:11 gopherbot

All three follow the same pattern of returning the closest representable value, and a big.Accuracy enum indicating whether the conversion was exact, a rounding up, or a rounding down.

Per https://pkg.go.dev/math/big#Accuracy:

Accuracy describes the rounding error produced by the most recent operation that generated a Float value, relative to the exact value.

It doesn't seem like too much of a stretch to apply that to a Float64 method, since one can think of Float64 as the composition of (*Float).SetInt and (*Float).Float64, but avoiding the allocation of the intermediate Float value.


I'm not so sure about ToInt64 and ToUint64, though — a big.Int represents exact integer values (all of ℤ), and int64 and uint64 each also represent exact integer values (subsets of ℤ). so converting, say, -9223372036854775808 to 0 in ToUint64() is not what I would call a “rounding error” — it is clamping, not rounding. Moreover, since the integer values are exact it is already possible for a caller to figure out whether a big.Int is above a 64-bit upper bound or below a 64-bit lower bound using a (*Int).Set method, an appropriate Min or Max constant from the math package, and (*Int).Cmp.

I guess you could argue that ToInt64 and ToUint64 are like the composition of (*Float).SetInt and (*Float).Int64 / (*Float).Uint64? But that seems like an awkward fit: I wouldn't expect an integer-to-integer conversion to round-trip through a lossy floating-point representation. 😅

bcmills avatar Nov 29 '22 20:11 bcmills

Floating point values are just as exact as Integers, they're just a different subset of the rationals. I don't see any fundamental difference in the idea that the range of Int and Float are strictly larger than int and float, and thus approximation is inevitable in conversion. Indeed, the ratio of cardinalities is much bigger (infinite) in the Int:int case. I'd be happy to adjust the wording on Accuracy to make clear that it applies to integer range conversions too.

adonovan avatar Nov 29 '22 20:11 adonovan

I guess I just don't see a compelling need for ToInt64 and ToUint64, given that it's easy to hoist the comparisons into big.Int. Since that comparison can be done using shared big.Int wrappers for the constants, it only requires O(1) extra allocations in order to perform arbitrarily many concurrent conversions.

In contrast, the value of the Float64 method is clear: today, converting a big.Int to a float64 today requires allocating an extra big.Float for each conversion, producing O(N) allocations for N concurrent conversions.

Moreover, it is much harder to check the Float64 conversions ahead of time using big.Int calculations: neither the transition from “exact” to “rounding error bounded by magnitude” (at 0x1p53 + 1) nor the transition from “bounded by magnitude” to “infinite rounding error” (at math.MaxFloat64 + 0x1p970) corresponds to any existing constant in the math package.

bcmills avatar Nov 30 '22 16:11 bcmills

I don't see any fundamental difference in the idea that the range of Int and Float are strictly larger than int and float, and thus approximation is inevitable in conversion.

The fundamental difference is that the bound on the rounding error for a converted float64 value is proportional to the magnitude of the converted value, whereas the bound on the rounding error for a converted int64 or uint64 value is discontinuous: it switches abruptly from “exactly 0” (at non-maximal values) to “unbounded” (at the two maximal values for each type).

bcmills avatar Nov 30 '22 16:11 bcmills

I don't think memory usage is the motivating factor here, though it is certainly true that the naive implementation of Float64() is allocation-heavy and a low-allocation version is non-trivial, which is a reason to include this operation in the API. But you're right that callers can implement efficient versions of the two {u,}int64 methods themselves without excessive difficulty, so it's fair to describe them convenience methods.

The fundamental difference is that the bound on the rounding error for a converted float64 value is proportional to the magnitude of the converted value...

That doesn't seem very important. If the rounded Int.Float64 result is (Infinity, Above), all you really know is that the integer was too large. So too with the (MaxInt64, Above) result from Int64 and Uint64: the value was too large. In that sense these Above/Below cases are more similar to indications of infinity than minor rounding errors.

Without loss of generality or efficiency, we could define the Int64 and Uint64 methods so that they return a bool instead of an Accuracy. True would mean "exact", false would mean "truncated to min[T] or max[T]". Would you prefer that?

adonovan avatar Nov 30 '22 18:11 adonovan

For ToInt64 and ToUint64, I'm just not sure that the API additions pull their weight. I'm also not sure why the saturating behavior (as proposed) would be any more useful than truncating (analogous to conversions in the Go language proper) — as far as I can tell they're both only useful in fairly marginal cases.

But, assuming for the sake of argument that saturating really is more useful, would it make more sense to define the existing Int64 and Uint64 methods to do that? Then the check might look something like:

b := new(big.Int)
…
i := b.Int64()
if (i == math.MaxInt64 || i == math.MinInt64) && !b.IsInt64() {
	// Conversion was inexact.
	…
}	

That's not much (if any!) more expensive than checking the accuracy return-value, and also wouldn't require any new API surface.

bcmills avatar Nov 30 '22 19:11 bcmills

The existing methods have no way to report inexactness, which is the most important motivation for the new {u,}int64 methods.

adonovan avatar Nov 30 '22 20:11 adonovan

Right, but IsInt64 and IsUint64 report inexactness — so folding them into a single method call seems more like a performance optimization than an API necessity.

bcmills avatar Nov 30 '22 20:11 bcmills

Right, but IsInt64 and IsUint64 report inexactness — so folding them into a single method call seems more like a performance optimization than an API necessity.

Fair enough. Let's retract the integer methods from the proposal then.

adonovan avatar Nov 30 '22 21:11 adonovan

I think @bcmills makes some good points. I am also in favor of simply adding

func (x *Int) Float64() (float64, Accuracy)

griesemer avatar Dec 21 '22 19:12 griesemer

Anyone else object to the reduced proposal?

adonovan avatar Dec 22 '22 19:12 adonovan

The proposed implementation in https://go.dev/cl/453115 is ready for review.

adonovan avatar Dec 27 '22 19:12 adonovan

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc avatar Jan 04 '23 19:01 rsc

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

rsc avatar Jan 11 '23 19:01 rsc

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

rsc avatar Jan 18 '23 19:01 rsc

Change https://go.dev/cl/500116 mentions this issue: math/big: rename Int.ToFloat64 to Float64

gopherbot avatar Jun 01 '23 22:06 gopherbot