draft-ietf-jsonpath-base

Question - does a Normalized Path require a JSON query argument (data)?

Open rob-ross opened this issue 6 months ago • 2 comments

Figure 3 of RFC 9535 indicates that a normal-index-selector is a non-negative decimal integer.

Thus a query using a negative index like -3 would be normalized by changing the index to its non-negative equivalent. However, this requires having a concrete JSON value in order to perform the normalization, while a query with only non-negative indexes does not.
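To make the dependency on data concrete, here is a minimal sketch of the index arithmetic. `normalize_index` is a hypothetical helper, not something defined by RFC 9535 or any particular library; the point is only that the array length has to come from the JSON value being queried.

```python
from typing import Optional


def normalize_index(index: int, length: int) -> Optional[int]:
    """Map an array index to its non-negative equivalent.

    `length` is the length of the array the index is applied to, which is
    only known once there is concrete data. Returns None when the index
    selects nothing in an array of that length.
    """
    if index < 0:
        index += length  # e.g. -3 in an array of length 5 becomes 2
    if 0 <= index < length:
        return index
    return None
```

The same index -3 maps to 2 for an array of length 5 and to 7 for an array of length 10, so there is no single non-negative index to write down without the data.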

This seems unfortunate to me, as you can have a well-formed and valid query that cannot be normalized until evaluated against data if it contains negative indices, but that can be normalized when it does not.

Am I thinking about this wrong? Is it the intent that all Normalized Paths must be evaluated first against data to create a Normal Path? And if not, what do people generally do when they have a path they can otherwise normalize, but it contains negative indices?

Thanks!

rob-ross, Jul 09 '25 20:07

Is it the intent that all Normalized Paths must be evaluated first against data to create a Normal Path?

As I understand it, yes. As well as having a consistent string representation, a normalized path must be a singular query that identifies exactly one node in the query argument (the data).

what do people generally do when they have a path they can otherwise normalize, but it contains negative indices?

I have my own definition of "canonical path" for serializing a "compiled" JSONPath query back to a string. From the README here:

canonical_paths.json contains test cases for serializing a "compiled" JSONPath query back to a string. The canonical representation of a compiled query is not a normalized path, is not necessarily a singular query and is not specified by RFC 9535.

For our purposes, a canonical path is one that (these things are up for debate):

  • uses bracket notation rather than shorthand selectors
  • uses single quotes rather than double quotes
  • includes the default step of 1 in slice selectors
  • excludes enclosing parentheses in filter selectors. ?<expression> rather than ?(<expression>)
  • minimizes parentheses in logical expressions
  • follows escaping rules defined in section 2.7 of RFC 9535.
  • uses exactly one space character either side of logical and comparison expressions
  • uses exactly one space character after a comma
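As a rough illustration of a few of these rules (bracket notation, single quotes, explicit step of 1, one space after a comma, escaping), here is a minimal sketch. The selector model and the names `quote_name`, `selector_to_str` and `canonical_path` are assumptions made up for this example, not the API of any real implementation. Filter selectors and descendant segments are left out, and the escaping follows my reading of section 2.7.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical selector model: a str is a name selector, an int is an index
# selector, and a (start, stop, step) tuple is a slice selector.
Slice = Tuple[Optional[int], Optional[int], Optional[int]]
Selector = Union[str, int, Slice]

# Escapes for single-quoted names (assumed reading of RFC 9535 section 2.7).
_ESCAPES = {
    "\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t",
    "'": "\\'", "\\": "\\\\",
}


def quote_name(name: str) -> str:
    """Single-quote a member name, escaping quotes, backslashes and controls."""
    out = []
    for ch in name:
        if ch in _ESCAPES:
            out.append(_ESCAPES[ch])
        elif ord(ch) < 0x20:
            out.append(f"\\u{ord(ch):04x}")  # other control chars, lowercase hex
        else:
            out.append(ch)
    return "'" + "".join(out) + "'"


def selector_to_str(sel: Selector) -> str:
    """Serialize one selector; negative indices are kept as written."""
    if isinstance(sel, str):
        return quote_name(sel)
    if isinstance(sel, int):
        return str(sel)
    start, stop, step = sel
    parts = [
        "" if start is None else str(start),
        "" if stop is None else str(stop),
        "1" if step is None else str(step),  # always include the default step
    ]
    return ":".join(parts)


def canonical_path(segments: List[List[Selector]]) -> str:
    """Serialize child segments using bracket notation only."""
    return "$" + "".join(
        "[" + ", ".join(selector_to_str(s) for s in seg) + "]" for seg in segments
    )
```

For example, `canonical_path([["store"], ["book"], [-1]])` gives `$['store']['book'][-1]`, with no JSON value involved at any point.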

jg-rp, Jul 13 '25 06:07

I believe the spec also states that a well-formed and valid query should not return any errors when it is being evaluated. If the query references a non-existent node, it should just return Nothing instead of an index/key error. So this implies (to me) that a normalized path could refer to a non-existent node in a value, say the 10th item in an array of size 9, and it would still be considered a normalized path. (Please correct me if I'm wrong about this.)

Assuming for the sake of argument that I am correct, then there really is a dichotomy between negative and positive index values in normalized paths. We can't use negative index values in a normalized path, so we need to evaluate the query against a value first to translate the negative indices into non-negative ones representing the actual locations in the array. There is no such requirement for positive indices: they can be used in a normalized path even when they don't refer to an existing array element. So the spec "feels" a little inconsistent on this topic.

From my reading of the above, your canonical path doesn't exclude negative array indices, correct? In the case of a canonical path for a singular query using positive indices, this would be the same as a normalized path string, right? And in the case of negative indices but still a singular query, the negative indices would be the only thing preventing it from being a normalized path.
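To check my own understanding, the purely syntactic side of this can be sketched as below. The segment model and the name `is_statically_normalizable` are hypothetical, and the check says nothing about whether the node actually exists in any data; it only tests whether the canonical string would already fit the normalized-path syntax.

```python
from typing import List, Union

# Hypothetical model: each segment is a list of selectors, where a str is a
# name selector, an int is an index selector, and anything else (slice,
# wildcard, filter) is some other object.
Segment = List[Union[str, int, tuple]]


def is_statically_normalizable(segments: List[Segment]) -> bool:
    """True if every segment has exactly one name or non-negative index selector."""
    for seg in segments:
        if len(seg) != 1:
            return False  # multiple selectors: not a singular query
        sel = seg[0]
        if isinstance(sel, str):
            continue
        if isinstance(sel, int) and not isinstance(sel, bool) and sel >= 0:
            continue
        return False  # negative index, slice, or other selector
    return True
```

Under this sketch, `[["a"], [0]]` passes, while `[["a"], [-1]]` fails only because of the negative index.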

I like this concept. It lets me construct a canonical path without evaluating a JSON value, and it works for general queries as well as singular queries. It also preserves negative indices, which are useful, among other reasons, when you want to select the "last" item in an array.

I think I will adopt your concept of "canonical path" in my own project.

Thanks!

Rob

rob-ross, Jul 17 '25 20:07