JSON precision loss on `copyCurrentEvent()` for floats that require greater than DOUBLE precision
Feature Request
I'd like a way to determine if getNumberType() is guaranteed to return an exact response or might lose precision if parsed to the corresponding native type.
Motivation
The JSON JsonParsers always return NumberType.DOUBLE -- never FLOAT or BIGDECIMAL. Consequently, I've been (incorrectly) parsing all JSON JsonToken.VALUE_NUMBER_FLOAT as doubles, losing precision for anything that doesn't fit. These comments suggest that the reason for always returning DOUBLE is that BigDecimal parsing is expensive, so it's up to the jackson-core consumer to ask for one if they want it.
As a user of the JsonParser I don't know if I want a BigDecimal at the point I interact with the JsonParser. I'm pushing the value downstream to be interpreted later. I'd like to get either a primitive float/double or else a CharSequence. This is the approach I'm taking in https://github.com/rallyhealth/weePickle/pull/102.
However, converting all floats to strings is suboptimal when used with jackson-dataformats-binary. SmileParser knows that a double-tagged value (0x29) is 100% guaranteed to fit in a primitive double, but I have to convert it to a base10 string (and probably back to a base2 double later), since I can't trust getNumberType's answer of DOUBLE.
I've found https://github.com/FasterXML/jackson-core/issues/631, but for JSON, it appears always to return BigDecimal even for values like 0.0 that would fit in a double. I've benched about a 50% throughput penalty for sending all floats through BigDecimal.
AFAICT, it is not possible to determine whether or not to trust getNumberType() to avoid precision loss for any given JsonParser without hardcoding to its known behavior ahead of time (which may change in subsequent releases).
Proposal
I don't see a way to update the JSON JsonParser to return exact getNumberType() values without increasing upfront parsing cost, but if I had a signal that getNumberType() is trustworthy for a Smile JsonParser, but not for a JSON JsonParser, I could work around it. I would be able to ask for Strings for only non-exact values.
I think some method like boolean isNumberTypeGuaranteedExact() would satisfy my needs. Being able to ask for each token seems ideal. For the JSON JsonParser, it might return false for all JsonToken.VALUE_NUMBER_FLOAT.
WDYT? And especially, is there some other existing option that I've overlooked?
Thanks!
It just occurred to me that this causes precision loss in jackson-core as well when going from JSON => JSON (e.g. when pretty printing). Failing test in #731.
A simpler but more disruptive option might be to introduce a new NumberType.FLOAT_UNKNOWN_PRECISION.
Not commenting on earlier parts, but I will say that addition of a new NumberType enum value would lead to significant compatibility breakage.
I also thought I had added something in this space, but seems there's only JsonParser.getNumberValueExact() which is not quite it.
I think that since nature of exactness is mostly binary (exact usually) vs textual (usually not), it'd be per-parser feature: if so, could add new StreamReadCapability so caller could check if parser can provide exact value or not.
Or something along those lines.
caller could check if parser can provide exact value or not.
That'd work.
@pjfanning Forgot this one, related to work to try to retain best representation. Failing test accidentally referenced #733 (PR) so changed that name.
@cowtowncoder I'm not up to speed on this issue. Have we broken something?
@pjfanning No, not at all -- it's an older issue; basically value that is outside of double gets converted to infinity and not retained in its full accuracy. I do not remember too many details but there is a reproduction. So was hoping it might have been resolved by other work, but does not seem to be the case.
Hmmh. @pjfanning I think I may need to consider reverting functional change here, due to concerns wrt CBOR backend (and generally binary ones). As long as TokenBuffer can reliably retain accuracy, it is not 100% necessary for copyCurrentEvent() to try to do that -- I'll see how we changed that (with "Deferred" access, I think).
I am also thinking of adding new "copyCurrentEventExact()" method that would use the new mechanism, but then leave "copyCurrentEvent()" to use the old/existing (2.14) and before logic.
It is too bad we don't have tests that check for write sequences (although ones would be easy enough to add, either by Mocking or just sub-classing JsonGeneratorDelegate).
Replaced by #984; closing as duplicate.
Note: besides 2.15 adds lots of functionality to help (including #984), it's worth noting there is JsonParser.getNumberValueExact() (added in 2.12) that can help.