jq icon indicating copy to clipboard operation
jq copied to clipboard

jv: Add some support for 64 bit ints in a very conservative way (ALTERNATIVE)

Open nicowilliams opened this issue 8 years ago • 12 comments

This is a variant of #1246 that doesn't change the size of a jv.

@dequis What do you think?

nicowilliams avatar Jan 29 '17 01:01 nicowilliams

I'm not a fan of the JQ_OMIT_INTS ifdefs, they smell like premature optimizaition, probably worth benchmarking to see if they are justified.

You should probably change the title of this PR because it's not very conservative, heh. In particular, messing with the union was something i explicitly wanted to avoid - my implementation only affected display and didn't have to deal with other parts of the code. But everything seems to go through jv_number_value() and jv_number() so if you think that's fine I think that's fine.

Having signed 64 bit ints too is neat.

Only skimmed, haven't reviewed much. Also haven't tested.

dequis avatar Jan 29 '17 05:01 dequis

Arithmetic (and math) continues to go through jv_number() and jv_number_value(). Only parsing and printing check the integer members of the union, and the utility jv functions for C applications that want them. Not changing the size of jv, and not having multiple internal representations of any given number at once, seems conservative to me :) and anyways, I kept the title of your PR.

nicowilliams avatar Jan 29 '17 06:01 nicowilliams

As to the JQ_OMIT_INTS... I am copying the SQLite3 pattern. It's fairly cheap, but the problem is that it probably won't get much testing. It is not a premature optimization so much as... me not having measured the impact of this new feature.

nicowilliams avatar Jan 29 '17 06:01 nicowilliams

@nicowilliams - No doubt I'm missing something, but using @dequis's test case, it seems this change entails a step backwards:

github/nicowilliams/jq$ ./jq --version
jq-1.2-889-gb28c886

github/nicowilliams/jq$ git log -1
commit b28c886edbb45c2128f7f5eb79da169190473cff
Author: Nicolas Williams <[email protected]>
Date:   Sat Jan 28 18:56:55 2017 -0600

    jv int64 and uint64 support

github/nicowilliams/jq$ echo 111111111111111111 | ./jq -c '[., 1*.]'
echo 111111111111111111 | ./jq -c '[., 1*.]'
[111111111111111104,111111111111111100]

By contrast:

$ echo 111111111111111111 | jq1.5 -c '[., 1*.]'
[111111111111111100,111111111111111100]

pkoppstein avatar Jan 29 '17 06:01 pkoppstein

@pkoppstein Yeah, and that's not the only thing that's wrong. Try echo 2 63^p|dc|jq .. We need more test cases, that's for sure.

nicowilliams avatar Jan 29 '17 07:01 nicowilliams

Tests still needed.

nicowilliams avatar Feb 03 '17 00:02 nicowilliams

@wtlangford Can you review this? I think it's done.

Also, I'm thinking of taking this approach a bit further. I'm thinking of adding jv sub-types for:

  • binary data (sub-type of array, appearing to be an array of numbers in the range of 0..256 -- i.e., bytes)
  • array slices when the offset/size does not fit in a jv EDIT: See #1108.

nicowilliams avatar Feb 26 '17 01:02 nicowilliams

Ah, so, jv_identical() needs to be updated too.

EDIT: No, it's done already.

nicowilliams avatar Feb 28 '17 22:02 nicowilliams

We appear to continue to do double math with this change

In my version of this, that was explicitly intended behavior. The idea was to fix the common case of using jq as a pretty-printer and/or filter, so as long as the numbers aren't touched they keep their old values.

dequis avatar Mar 02 '17 19:03 dequis

Ah, then I have no complaints about the double-precision math. Additionally, maintaining existing behavior is easier than changing it and then deciding to change it back later; so we can always revisit in the future.

wtlangford avatar Mar 02 '17 19:03 wtlangford

Eventually we might add branches to basic arithmetic operators to avoid double-precision math when possible. But I'm a bit loathe to do that right now. The goal here really is to allow jq . to pass integers through unmodified. Of course, they'll still get parsed and formatted, but there won't be any scientific notation on output, which appears to be the most commonly desired thing here.

We could also fix the formatter to never use scientific notation for doubles with no non-zero decimal part. But we do also get calls for larger integer ranges than -2^52..2^52, which I think is fair enough.

nicowilliams avatar Mar 02 '17 20:03 nicowilliams

Thank you all for JQ.

It would be great if you could address this issue.

chrishmorris avatar Jul 17 '24 08:07 chrishmorris