inconsistent treatment of negative 0
Negative zero in an index selector is invalid:
index-selector = int ; decimal integer
int = "0" /
(["-"] DIGIT1 *DIGIT) ; - optional
DIGIT1 = %x31-39 ; 1-9 non-zero digit
Negative zero in a number within a comparable is explicitly allowed:
number = (int / "-0") [ frac ] [ exp ] ; decimal number
Practically, this inconsistency makes the parsing code more complicated. The lexer can't just raise an error when it sees "-0", because the token might be legal. That state has to be preserved until the parsing stage, when the abstract syntax tree reveals whether this occurrence is the "allowed" one or not.
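One way to carry that state forward, sketched in Python with hypothetical helper names (nothing here comes from the spec itself): the lexer accepts "-0" as an ordinary number token, and the parser decides its legality later, once it knows the syntactic position.

```python
import re

# Number per the JSONPath grammar: int or -0, with optional frac and exp.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?")

def as_index(lexeme: str) -> int:
    """Validate a number lexeme used where the grammar wants an int index."""
    if NUMBER.fullmatch(lexeme) is None:
        raise SyntaxError(f"bad number {lexeme!r}")
    # -0 and anything with a frac or exp part is not a valid index int.
    if lexeme == "-0" or "." in lexeme or "e" in lexeme.lower():
        raise SyntaxError(f"{lexeme!r} is not a valid index")
    return int(lexeme)

def as_comparable(lexeme: str) -> float:
    """In a comparable, -0 is legal; it is just the float -0.0."""
    if NUMBER.fullmatch(lexeme) is None:
        raise SyntaxError(f"bad number {lexeme!r}")
    return float(lexeme)
```

The same lexeme goes through a different gate depending on where the parser finds it, which is exactly the deferred check described above.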
Personally, I would prefer to just allow negative zero. It's harmless, easy to parse, and most programming languages handle it silently. If you are writing a parser good enough to detect this error, it is just as easy to treat it as positive zero and move on.
Thank you for mentioning this implementation complication. In an index position, negative numbers mean something: they index from the right, with the reference point being the first element that isn't there (one past the end). -0 would really mean you want that first element after the array. I think the confusion that results from expressing something like this is bad enough to warrant a little additional checking.
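Python's own negative indexing shows the trap: -0 silently collapses to 0, so a query author who wrote [-0] almost certainly did not get what they meant.

```python
data = ["a", "b", "c"]

# Negative indexes count from the right: -1 is the last element.
assert data[-1] == "c"

# But -0 == 0, so a would-be "zeroth element from the right"
# silently becomes the first element from the left.
assert data[-0] == "a"
```

Flagging -0 at parse time surfaces that confusion instead of silently guessing.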
I have implemented the RFC :)
I got bit by this same issue when testing against the cts.json compliance file. Using the regex definition of Number given in the RFC, -0 will scan as a number literal, but not as an int literal. In a typical scanner you would just create a NumberLiteral token for this value and disambiguate it later as an int or float. Since floats can be neither index nor slice arguments, they should be rejected at the parsing stage. I had to refactor how I matched this in the lexer. If the pattern matches a number, I check whether it has a fractional part or an exponent part. If not, it can't be a float, so I try to match an int. A -0 will fail to match either float or int, producing a syntax error in the lexer.
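A sketch of that disambiguation (my reading of the scheme, with the RFC patterns translated to Python regexes):

```python
import re

# Number per the grammar: an int-shaped prefix plus optional frac and exp.
NUMBER = re.compile(
    r"(?P<int>-?(?:0|[1-9][0-9]*))(?P<frac>\.[0-9]+)?(?P<exp>[eE][-+]?[0-9]+)?"
)
# int per the ABNF: "0", or an optional minus and a nonzero leading digit.
# Note that -0 does NOT match this pattern.
INT = re.compile(r"0|-?[1-9][0-9]*")

def scan_number(lexeme: str):
    m = NUMBER.fullmatch(lexeme)
    if m is None:
        raise SyntaxError(f"bad number {lexeme!r}")
    if m.group("frac") or m.group("exp"):
        return ("FLOAT", float(lexeme))   # frac or exp present: a float
    if INT.fullmatch(lexeme) is None:     # no frac/exp but not an int: bare -0
        raise SyntaxError(f"{lexeme!r} is not a valid int")
    return ("INT", int(lexeme))
```

As the Edit below notes, this scheme also rejects a bare -0 inside a comparable, where the spec allows it.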
Edit: Oh, I forgot that this change I made introduced the same bug you mention about comparables. I now fail the query "$[?@.a==-0]", as my scanner gives me a syntax error for the -0 here.
I think the root issue is that in context, the -0 in an index or slice argument is an int, which is not allowed. But the -0 in a comparable could either be an int or a float. A -0 int would be invalid but a -0 float would be ok. I don't know how you tweak the grammar for this. Maybe you just need to implement some semantic logic at the point of reference to enforce these rules.
Well, this issue has been a pet peeve of mine for a while so I did a little more research into it.
The JSON-Path spec is just implementing the JSON spec (RFC 8259) regarding numbers and ints.
Specifically, on page 7-8 of the JSON spec:
number = [ minus ] int [ frac ] [ exp ]
...
int = zero / ( digit1-9 *DIGIT )
So changing this behavior in the JSON-Path spec would make that change inconsistent with the JSON spec.
I would argue that we would need to get this change made in the JSON spec before changing the JSON-Path spec. I don't know how difficult that would be. Someone would have to do a detailed analysis on "why" this change is important first, and how it affects existing codebases. I naively think that this is just expanding an existing feature, rather than removing one, constraining one, or adding a new one. So this would be one of the simpler spec changes to make. But that's a naive opinion.
As I wrote above, the issue is that although in many programming languages you can write -0, if that literal is interpreted as an int, you lose the negative context of the original literal. Ints don't have a concept of positive zero vs negative zero. (Maybe they do in math in general (?), but not on digital computers.) So the "value" of an int literal written as '-0' will just be '0'.
Floats however can remember that the literal is -0 and not just 0. JavaScript itself will parse -0 as a float literal, preserving the negative sign for subsequent operations on the float.
So the problem with this in the JSON-Path spec is not so much using -0 as an index or slice value. That seems harmless enough in isolation. The problem is that a -0 is a float, not an int, and you can't use a float as an array or slice index. But you can use a float in this context for comparison:
$[?@.a==-0] # perfectly valid float comparison
So, I have come full circle on this issue and now believe that no changes should be made to the spec, and we all just have to add a few more guards in our code to reject -0 ints.
So changing this behavior in the JSON-Path spec would make that change inconsistent with the JSON spec.
I would argue that JSON only defines numbers, and the int declaration is a portion of a number, not a full value unto itself.
With this in mind, I don't think that we necessarily need to change JSON first. We're defining "integer" in addition to what JSON defines.
So the problem with this in the JSON-Path spec is not so much using -0 as an index or slice value.
That is not a bug, but a feature. What do you think [-0] is supposed to mean? I do not understand why some of you seem to want to extend the syntax to allow nonsensical indexes/slice arguments.
Just as a reminder:
The ABNF rule int is defined as
int = "0" /
(["-"] DIGIT1 *DIGIT) ; - optional
DIGIT1 = %x31-39 ; 1-9 non-zero digit
... and is used in index and slice selectors:
index-selector = int ; decimal integer
slice-selector = [start S] ":" S [end S] [":" [S step ]]
start = int ; included in selection
end = int ; not included in selection
step = int ; default: 1
These are exactly as they should be.
As a shortcut, int is also used in JSON-Path's definition of number:
number = (int / "-0") [ frac ] [ exp ] ; decimal number
To allow IEEE 754 -0.0 (and its alternative JSON notation -0), we need to add an alternative here, and only here.
... and if your scanner implementation (you don't actually need a scanner for JSONPath) wants to mark number values that cannot be int values, you have to react to all three: -0, frac, and exp.