json
json copied to clipboard
Inserting values at pointers
Currently trying out the new JSON pointer implementation and I think this would be very useful in many cases.
One thing I have needed is inserting or assigning arbitrary JSON values at locations specified by JSON pointers. Is it currently possible? How can I achieve it in the current state?
Probably a duplicate of https://github.com/boostorg/json/issues/480
Not a duplicate. That issue is about finding an element by its path, and this one is about inserting an element at a specified path.
Just to be sure, this is about creating intermediary objects, right? Do you also have any suggestions on how to deal with non-existant elements of arrays?
Yes, non-existent objects are to be created at the insertion. For example,
value jv1 {{"a": {{"b", 1}}}};
value jv2 {{"c": {{"d", 2}}}};
jv1.insert_or_assign("/x", jv2);
jv1.insert_or_assign("/a/b", jv2);
assert(jv1.at_pointer"/x/c/d" == 2) // new value inserted at previously non-existent path /x
assert(jv1.at_pointer("/a/b/c/d") == 2) // new value assigned at the existing path /a/b
For non-existent array elements, I think std::vector assignment behavior might be preferable. Assign the value if the index is in range or else return an out-of-range/not-found error. Inserting to the back of an array can be specified using the - character according to RFC 6901. ~~A helper function to get the size of a json::array at a given pointer path may be needed as well.~~ Getting the size of objects/arrays via pointers is currently possible.
Further investigations with the current API:
For objects:
value jv = {{"abc", {{"xyz", {12, 13, 15}}}}};
jv.at_pointer("/abc/xyz") = {"123", "1231"}; // works as desired
jv.at_pointer("/abc/klm") = {"123", "1231"}; // does not work but desired
For arrays:
value jv = {{"abc", {{"xyz", {12, 13, 15}}}}};
jv.at_pointer("/abc/xyz/0") = {"123", "1231"}; // works as desired
jv.at_pointer("/abc/xyz/3") = {"123", "1231"}; // does not work as desired
jv.at_pointer("/abc/xyz/-") = {"123", "1231"}; // does not work but desired
We're having differences of opinion on what jv.insert_at_pointer("/a/2", 5) needs to create, an array or an object. @grisumbras thinks it should create an object and set its "2" key to 5. I find this consistent but kind of not really doing what one would want in the majority of cases. It seems to me that it should create an array and set its third element to 5 (leaving the first two null).
There are 3 questions we need to answer to decide on the behaviour.
- Whether to replace
nulls when creating intermediaries. - When creating intermediaries whether to treat a number segment of the path as an implication of array index or object key.
- Whether to extend arrays with null elements when a referenced index does not exist.
Positive answer to question 3 also implies positive answer to question 1, as otherwise there's very little utility in it.
Personally, I am leaning towards 1 - yes, 2 - no, 3 - no. The only way to create an intermediary array would be using - segement. That would also be the only way to extend an array.
First of all, I very much don't like option 3. I'm also not sure if it's actually needed, because - segment can be used. Without option 3 there will be no way reason to create arrays using number segments, just use -. So option 2 becomes irrelevant.
Meanwhile, option 1 is very useful IMO, doesn't create inconsistencies and probably even simplifies implementation.
Examples with my approach:
value jv = parse(R"( {"a": null, "x": [2]} )");
jv.assign_at_pointer("/a/b/1", 3); // {"a": {"b": {"1", 3}}, "x": [2]}
jv.assign_at_pointer("/x/-", 4); // {"a": {"b": {"1", 3}}, "x": [2, 4]}
jv.assign_at_pointer("/y/-", 5); // {"a": {"b": {"1", 3}}, "x": [2, 4], "y": [5]}
jv.assign_at_pointer("/z/0", 6); // {"a": {"b": {"1", 3}}, "x": [2, 4], "y": [5], "z": {"0": 6}}
jv.assign_at_pointer("/x/1", 7); // {"a": {"b": {"1", 3}}, "x": [2, 7], "y": [5], "z": {"0": 6}}
jv.assign_at_pointer("/x/3", 8); // error: we don't extend arrays by more than one element
So, to help me understand if there's a case for supporting option 3, can anyone present a realistic use case?
There's no use case for (1) if you don't have (2)+(3).
(1) is useful if one has (3) and wants to maintain order independence as in
value jv = array();
jv.assign_at_pointer("/1/y", 5);
jv.assign_at_pointer("/0/x", 3);
giving the same result as
jv.assign_at_pointer("/0/x", 3);
jv.assign_at_pointer("/1/y", 5);
because there's an implicit null created by the first line.
On second thought, though, I no longer favor (3) because it makes denial of service attacks trivial ("/123456789"). So I agree that numeric indices should be constrained to [0, size()] (with - being the same as size().)
(1) yes, (3) no.
Still, there might be some rough edges regarding (1) though as it is not only limited to null values but all values that are not objects.
Also I think the minus sign shouldn't create a new array but treat it as a regular key if the stem is not an array. For example,
value jv = parse(R"( {"x": [2]} )");
jv.assign_at_pointer("/x/-", 5); // {"x": [2, 5]}
jv.assign_at_pointer("/y/-", 5); // {"x": [2], "y": {"-" : 5}}
The previous behavior can be obtained via assigning an array directly to the stem:
value jv = parse(R"( {"x": [2]} )");
jv.assign_at_pointer("/y", {5}); // {"x": [2], "y": [5]}
jv.assign_at_pointer("/y/-", 6); // {"x": [2], "y": [5,6]}
RapidJSON implements the behavior (3). The program below is problematic as @pdimov points.
#include "rapidjson/document.h"
#include "rapidjson/pointer.h"
using namespace rapidjson;
int main()
{
const char* json = R"({"my_obj": { "my_arr": [10,11,12]}})";
Document d;
d.Parse(json);
Document v;
v.Parse("20");
SetValueByPointer(d, "/my_obj/my_arr/999999999", v);
return 0;
}
Interesting. So RapidJSON says yes to 1, 2, and 3. And even converts other kinds of values to arrays and objects, not just nulls. Maybe we need 2 kinds of functions: one conservative and one "helpful".
@doganulus @pdimov can you check out #713 to see if you agree with the choices made in that implementation. The choices are:
- Nulls are replaced when creating intermediate elements.
- When creating intermediate elements segments are always treated as object keys, (in other words, only objects are created as intermediates).
- Arrays are extended by 1 element at most.
In other words, the answers to questions discussed previously are 1: yes, 2: no, 3: no.
@doganulus can you check out #713 again. It has been rewritten to be customisable. Does the provided functionality accomodate your use cases?
One question: Can we set values at intermediate keys as follows?
jv = value();
result = &jv.set_at_pointer("/a/b/c", "m"); //
result = &jv.set_at_pointer("/a/b", "n");
Or the option replace_any_scalar just implies that we can only replace the leaf values?
No, that option controls whether only nulls or any non-container can be replaced.
value jv = 1;
jv.set_at_pointer("/a/b/c", "m"); // exception
jv.set_at_pointer("/a/b/c", "m", {.replace_any_scalar = true}); // success