draft-ietf-jsonpath-base
Add a new selector type: regex selector
We have some fields in an object/array that are generated by the backend with names like header_$id, and I would like to select them with a regex.
Hey there @He-Pin. We actually have some support for that in the RFC. There are two regex functions, match() and search().
match() is implicitly anchored and will match on the full string.
search() is unanchored and will match on substrings.
Both use a flavor of regex called i-regexp, which was developed to be a compatible subset of most commonly used regex engines.
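To make the anchored/unanchored distinction concrete, here is a minimal Java sketch. It uses `java.util.regex` as a stand-in for an I-Regexp engine, which is an assumption: Java's regex flavor is much larger than I-Regexp. The class and method names (`RegexSemantics`, `fullMatch`, `substringSearch`) are hypothetical, and the pattern uses `[0-9]` rather than `\d` because, as far as I recall, I-Regexp leaves out multi-character escapes such as `\d`.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative and not taken from the RFC.
final class RegexSemantics {
    // match()-style semantics: the pattern must cover the whole input string.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // search()-style semantics: the pattern may match any substring of the input.
    static boolean substringSearch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String regex = "header_[0-9]+";
        System.out.println(fullMatch(regex, "header_1"));              // true
        System.out.println(fullMatch(regex, "x_header_1_old"));        // false
        System.out.println(substringSearch(regex, "x_header_1_old"));  // true
    }
}
```

In JSONPath itself both functions are applied to values inside a filter expression, which is why they do not directly help when the regex has to be tested against member names.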
I checked that, but it doesn't seem to fit our use case.
{
  "data": {
    "header_1": {
      "a": "1",
      "b": "2",
      "body": "{\"c\":\"3\"}"
    },
    "header_2": {
      "a": "1",
      "b": "2",
      "body": "{\"c\":\"3\"}"
    }
  }
}
Background: we want to select some JSON fields for translation, and we tried the Java JsonPath implementation. Given the JSON above, we want to select the fields header_1 and header_2 first. Is that supported by the current RFC?
I was using $.data[?(@.keys() =~ /header_\d+/i)], but it doesn't work. So now I'm implementing one based on the RFC, with this extended grammar:
regexSelector: /string-literal/
then I can write $[data][/header_\d+/]
As you can see, the main point here is that we are selecting on the property names of the object's children.
Oh, you want the property names to be matched, not the values.
That, I think, is likely going to be covered by #516, which is the piece you're missing. Once you can access the property names, you should be able to pass them into the functions.
I think we need the pointer to the child property name, maybe key() not keys().
@gregsdennis as in https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/109, I have implemented this with a new selector, RegExpSelector, which works on the property names of an ObjectNode.
@He-Pin it's great that you've been able to implement it, but be aware that because it's not a standard behavior, it's not interoperable.
We'll leave this open as an idea for a possible JSON Path v2, but there's no such discussion at the moment. Continuing to push this idea in the short term isn't going to make that happen any faster.
Understood. It's for an internal need, so that should be fine.
Another aspect of adding a regex selector is that there's no way to specify what kind of matching you want, which is why we have match() and search() functions rather than a simple =~ operator.
Yes, as it's a valid name too. But the name selector is written inside quotes, '$name', while the regex selector is written inside slashes, /$regex/.
An update on this: we are currently using:
/**
 * `/ $regexExp / $flags`
 */
private def regex[_: P]: P[Unit] = P("/" ~/ nonSlashOrEscapedSlash ~ ("/" ~ CharIn("idmsuUx").rep()))
`regexp-selector` | `name-selector` | `wildcard-selector` | `slice-selector` | `index-selector` | `filter-selector`
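For readers who don't use fastparse, the `/ $regexExp / $flags` literal can be read roughly as follows. This is only a sketch in Java, not the parser from the comment above: the class name `RegexLiteral` and its `parse` method are hypothetical, and mapping the flag letters `idmsuUx` onto the `java.util.regex.Pattern` constants is my assumption, based on those letters being the same ones Java uses for embedded flags.

```java
import java.util.regex.Pattern;

// Hypothetical parser for a `/regex/flags` literal; not part of RFC 9535.
final class RegexLiteral {
    static Pattern parse(String literal) {
        if (literal.length() < 2 || literal.charAt(0) != '/') {
            throw new IllegalArgumentException("regex literal must start with '/'");
        }
        // Find the closing slash, skipping any backslash-escaped character.
        int end = -1;
        for (int i = 1; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character, e.g. the '/' in "\/"
            } else if (c == '/') {
                end = i;
                break;
            }
        }
        if (end < 0) {
            throw new IllegalArgumentException("unterminated regex literal");
        }
        // Java's regex syntax accepts "\/" as a literal slash, so the body is used as-is.
        String body = literal.substring(1, end);
        int flags = 0;
        for (char f : literal.substring(end + 1).toCharArray()) {
            switch (f) {
                case 'i': flags |= Pattern.CASE_INSENSITIVE; break;
                case 'd': flags |= Pattern.UNIX_LINES; break;
                case 'm': flags |= Pattern.MULTILINE; break;
                case 's': flags |= Pattern.DOTALL; break;
                case 'u': flags |= Pattern.UNICODE_CASE; break;
                case 'U': flags |= Pattern.UNICODE_CHARACTER_CLASS; break;
                case 'x': flags |= Pattern.COMMENTS; break;
                default: throw new IllegalArgumentException("unknown flag: " + f);
            }
        }
        return Pattern.compile(body, flags);
    }
}
```

With this sketch, `RegexLiteral.parse("/header_\\d+/i")` yields a case-insensitive pattern. Note that the literal carries no indication of whether it should be applied with match or search semantics.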
I think one advantage of regexp-selector is that it is more lightweight than the search function: it does not require us to evaluate through the filter-expression-evaluator, yet it still covers 80% of cases.
And there are real-world needs for this; see https://github.com/json-path/JsonPath/issues/949
Edit: yes I see the difference. The regex needs to apply to the key, not the value.
~there are real-world needs for this~
~That issue is not indicative of a "need". The spec offers a solution. Yes, it's more verbose, but it also more explicitly expresses the intent of the path, which means it's more interoperable (the same path will evaluate consistently across implementations).~
I think this is a possibility for a potential JSON Path 2.
Yes, our current implementation is:
private void evaluateRegExpSelector(final Node match,
                                    final Pattern pattern,
                                    final boolean isLastSegment,
                                    final boolean isDescendant,
                                    final Consumer<Node> resultNodeCollector) {
    final var node = match.currentNodeValue();
    if (node instanceof ObjectNode objectNode) {
        for (Map.Entry<String, JsonNode> member : objectNode.properties()) {
            final String key = member.getKey();
            if (pattern.matcher(key).matches()) {
                final var value = member.getValue();
                final var location = match.location().append(key);
                final var newNode = newNode(objectNode, value, key, location, isLastSegment, isDescendant);
                resultNodeCollector.accept(newNode);
            }
        }
        increaseComplexity(objectNode.size());
    }
}
Here we test the regex against each child's property name, using pattern.matcher(key).matches().
As I had mentioned before, a choice will need to be made between `match` and `search` semantics. Or maybe a syntax that allows the user to specify which they want.
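To illustrate why that choice matters for a key-based selector, here is a small self-contained Java sketch. It works on a plain `Map` rather than Jackson's `ObjectNode`, and everything in it (`KeyRegexSelector`, `Mode`, `selectKeys`, and the extra key `old_header_3_backup`) is hypothetical and added only for illustration. `Mode.MATCH` corresponds to the `matches()` call in the implementation above; `Mode.SEARCH` uses `find()` instead.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical key selector; a sketch, not the implementation from this thread.
final class KeyRegexSelector {
    enum Mode { MATCH, SEARCH }

    // Returns the member names of `object` that satisfy `pattern` under the given mode,
    // in the map's iteration order.
    static List<String> selectKeys(Map<String, ?> object, Pattern pattern, Mode mode) {
        return object.keySet().stream()
                .filter(key -> mode == Mode.MATCH
                        ? pattern.matcher(key).matches()  // whole key must match
                        : pattern.matcher(key).find())    // any substring may match
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("header_1", Map.of("a", "1"));
        data.put("header_2", Map.of("a", "1"));
        data.put("old_header_3_backup", Map.of("a", "1")); // assumed extra key, not in the original JSON
        Pattern p = Pattern.compile("header_\\d+");
        System.out.println(selectKeys(data, p, Mode.MATCH));  // [header_1, header_2]
        System.out.println(selectKeys(data, p, Mode.SEARCH)); // [header_1, header_2, old_header_3_backup]
    }
}
```

The same pattern selects different members depending on the mode, so two implementations that each pick a different default would disagree on the same path.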