jv: Add some support for 64 bit ints in a very conservative way
This adds an extra int64_t field to jv, which is only written when parsing number literals (and only if they are integers within range), and only read when printing.
Any operations done over those numbers will downgrade them to a double with 53 bit precision. This is intentional for the scope of this patch:
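A minimal sketch of the idea described above, not the actual patch: `num_sketch`, `num_parse`, `num_multiply` and `num_format` are hypothetical names, and the struct is a stand-in rather than jq's real `jv` layout. It assumes the exact integer is stored only by the parser, dropped by every operation, and consulted only when printing.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a numeric jv; not jq's real layout. */
typedef struct {
    double  number;    /* always valid; every operation works on this */
    int64_t int_value; /* only meaningful when has_int is nonzero */
    int     has_int;   /* set by the parser, cleared by any operation */
} num_sketch;

/* Parse a literal; keep the exact value only for plain in-range integers. */
static num_sketch num_parse(const char *literal) {
    num_sketch n;
    n.number = strtod(literal, NULL);
    n.int_value = 0;
    n.has_int = 0;

    /* No decimal point and no exponent: candidate for the int64 field. */
    if (strpbrk(literal, ".eE") == NULL) {
        char *end;
        errno = 0;
        long long v = strtoll(literal, &end, 10);
        if (errno == 0 && end != literal && *end == '\0' &&
            v >= INT64_MIN && v <= INT64_MAX) {
            n.int_value = (int64_t)v;
            n.has_int = 1;
        }
    }
    return n;
}

/* Any operation uses the doubles and discards the exact integer. */
static num_sketch num_multiply(num_sketch a, num_sketch b) {
    num_sketch r;
    r.number = a.number * b.number;
    r.int_value = 0;
    r.has_int = 0;
    return r;
}

/* Print the exact integer if it survived, otherwise format the double. */
static void num_format(num_sketch n, char *buf, size_t len) {
    if (n.has_int)
        snprintf(buf, len, "%" PRId64, n.int_value);
    else
        snprintf(buf, len, "%.17g", n.number);
}
```

With this sketch, formatting `num_parse("111111111111111111")` gives back the literal unchanged, while the result of `num_multiply` on it has `has_int` cleared and only the rounded double left. The `%.17g` fallback is just a placeholder; jq formats doubles with its own routine.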
$ echo 111111111111111111 | jq -c '[., 1*.]'
[111111111111111111,111111111111111100]
For ints between 53 and 64 bits, this matches the behavior of awk, as suggested in the bug tracker by tischwa:
$ echo 111111111111111111 | awk '{print $1, 1*$1}'
111111111111111111 111111111111111104
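The two outputs above end in `...100` and `...104`, but they appear to denote the same double: an IEEE 754 double has a 53-bit significand, so near 1.1e17 adjacent doubles are 16 apart, and both digit strings round to the same value. A small sketch to check this, assuming IEEE 754 doubles and the default round-to-nearest mode (`through_double` and `same_double` are hypothetical helper names):

```c
#include <stdint.h>

/* Round-trip an integer through a double. Only meaningful for values whose
 * nearest double is still inside the int64_t range (the cast back is
 * undefined otherwise). The volatile forces a real 64-bit double even on
 * targets that keep intermediates in wider registers. */
static int64_t through_double(int64_t v) {
    volatile double d = (double)v;
    return (int64_t)d;
}

/* True when two integers convert to the same double. */
static int same_double(int64_t a, int64_t b) {
    volatile double da = (double)a;
    volatile double db = (double)b;
    return da == db;
}
```

Here `through_double(111111111111111111)` yields 111111111111111104, the value awk prints, and `same_double(111111111111111100, 111111111111111104)` holds, which is consistent with jq printing a shorter digit string for the same double.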
Fixes issue #369 (at least partially)
I tried to add tests, but the cases in jq.test seem to be parsed by jq itself, so they end up comparing the double values, not the int64 ones.
This is an itch I've really needed to scratch for a long time. Using jq as a pretty printer for JSON, I had almost gotten used to big numbers getting mangled. Not anymore!
Coverage increased (+0.4%) to 85.776% when pulling cf11ac03f44c8ba9f7948f8e50d05756b3a377d2 on dequis:64bit into 0b8218515eabf1a967eba0dbcc7a0e5ae031a509 on stedolan:master.
Hi. Thanks for this contribution. I left a comment on jv.h. Let us know what you think.
omg you're alive
heheh, yeah, I'm alive. Sorry for the absence :( I've been heads down in other things.
I'm working on an alternative version of this where a numeric jv can only be a double, an int64_t, or a uint64_t. The parser will use strtoumax() or similar when a numeric value has no exponent and no decimal point, and will produce a 64-bit integer, signed or unsigned as necessary, if the value is small enough to fit in 64 bits. The dumper will, of course, trivially dump 64-bit ints. There will be new jv functions to go with all of this.
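A rough sketch of how such a parser step might look, under the assumptions stated in the comment above. This is not code from the alternative branch; `num_kind`, `parsed_num` and `parse_number` are hypothetical names. The magnitude is parsed with `strtoumax()` and the sign is handled separately, so the full unsigned range stays available.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tagged number: exactly one representation at a time. */
typedef enum { NUM_DOUBLE, NUM_INT64, NUM_UINT64 } num_kind;

typedef struct {
    num_kind kind;
    union { double d; int64_t i; uint64_t u; } v;
} parsed_num;

static parsed_num parse_number(const char *s) {
    parsed_num out;
    out.kind = NUM_DOUBLE;
    out.v.d = strtod(s, NULL);

    /* A decimal point or an exponent means the value stays a double. */
    if (strpbrk(s, ".eE") != NULL)
        return out;

    int negative = (s[0] == '-');
    const char *digits = negative ? s + 1 : s;
    /* Require a leading digit so strtoumax cannot skip spaces or signs. */
    if (*digits < '0' || *digits > '9')
        return out;

    char *end;
    errno = 0;
    uintmax_t mag = strtoumax(digits, &end, 10);
    if (errno == ERANGE || *end != '\0' || mag > UINT64_MAX)
        return out; /* too large or trailing junk: keep the double */

    if (!negative) {
        if (mag <= (uintmax_t)INT64_MAX) {
            out.kind = NUM_INT64;
            out.v.i = (int64_t)mag;
        } else {
            out.kind = NUM_UINT64;
            out.v.u = (uint64_t)mag;
        }
    } else if (mag <= (uintmax_t)INT64_MAX + 1) {
        out.kind = NUM_INT64;
        /* INT64_MIN has no positive counterpart, so handle it explicitly. */
        out.v.i = (mag == (uintmax_t)INT64_MAX + 1) ? INT64_MIN
                                                    : -(int64_t)mag;
    }
    return out;
}
```

In this sketch `"18446744073709551615"` becomes a `NUM_UINT64`, `"-9223372036854775808"` a `NUM_INT64`, and anything with a decimal point, an exponent, or more than 64 bits of magnitude falls back to `NUM_DOUBLE`.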
My main fear is that this will slow things down. I may add macros for disabling this feature.
Why is it not acceptable to increase sizeof(jv) anyway? Is there some sort of API/ABI guarantee for other projects?
There's no ABI constraint on the size of a jv. But jvs are rather compact -- as compact as they can be, really, and this is useful because we pass them on the C and jq stacks a lot. There are some issues with jvs though, mostly the size of the size and offset fields, which are way too small and have bad overflow semantics, but even for fixing those I'd be reluctant to change the size of a jv.
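To make the size concern concrete, here is a small sketch comparing three layouts. The structs are only loosely modeled on a compact tagged value and are assumptions for illustration, not jq's actual definitions. An extra `int64_t` field grows the struct, while an integer that shares the existing union with the double and the pointer does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact value: small header plus one 8-byte payload. */
typedef struct {
    unsigned char  kind_flags;
    unsigned char  pad;
    unsigned short offset;
    int            size;
    union { void *ptr; double number; } u;
} value_compact;

/* Same header, with the exact integer added as a separate field. */
typedef struct {
    unsigned char  kind_flags;
    unsigned char  pad;
    unsigned short offset;
    int            size;
    union { void *ptr; double number; } u;
    int64_t        int_value;
} value_extra_field;

/* Same header, with the exact integer sharing the existing union. */
typedef struct {
    unsigned char  kind_flags;
    unsigned char  pad;
    unsigned short offset;
    int            size;
    union { void *ptr; double number; int64_t int_value; } u;
} value_union_int;

/* Bytes added per value by the separate-field layout. */
static size_t extra_field_cost(void) {
    return sizeof(value_extra_field) - sizeof(value_compact);
}
```

On a typical 64-bit platform the compact and union layouts are 16 bytes and the extra-field layout is 24, so every value passed on the stack would carry 8 more bytes. The trade-off of the union layout is that a value holds either the double or the integer, never both.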
Another thing is that having multiple number representations at once in any one jv seems like asking for trouble? Better just have one.
> Another thing is that having multiple number representations at once in any one jv seems like asking for trouble? Better just have one.
I like bitwise operations on integers a lot, but I also liked XSLT in the past, and XSLT has only one kind of "number": why not let jq do the same?
Speaking of XSLT, how about implementing translate?
JJOR
BTW, your submission is awesome. I'm running with it and modifying it to make the int value a part of the u union in the jv, but please don't feel bad about that.
@fadado jq will only have one kind of number. The idea here is that when the input was an integer that fits in an int64_t then that should be the internal representation and when printed it should be printed the way a printf function would do it (hmm, though we need to be careful to not have any internationalization, e.g., thousands separators!), which is to say: with no exponent, no decimal.
@dequis Check out #1327. What do you think?