p4c icon indicating copy to clipboard operation
p4c copied to clipboard

Improve json library internals

Open asl opened this issue 1 year ago • 2 comments

json library is a heavy cstring user for no particular reason. Essentially, jsons are huge string-keyed dictionaries. Let's see the common example of usage (this is from bfruntime.cpp):

isFieldSliceAnnot->emplace("value", "true");

emplace itself is essentially:

JsonObject *JsonObject::emplace(cstring label, IJson *value) {
...
    ordered_map<cstring, IJson *>::emplace(label, value);

Lots of bad things here:

  • There are at least two map lookups here: one during the conversion of label to cstring. We have string cache lookup (or two if the value should be inserted) here. Plus, if the label is new, then it is copied and remembered forever
  • There is a second map lookup inside ordered_map::emplace. Now using cstring's data pointer. Given that ordered_map is just plain slow std::map this is not so good. Actually, there are two map lookups here due to presence of value check, but this could be optimized with better iterator usage.

As a results, jsons are quite expensive both memory- and performance- wise. Also, they pollute string cache for no particular reason. These strings are never released.

I think the proper solution would be to have a specialized StringMap container. Specialized towards string keys. Given that we'd always need to have string cache lookup currently, it will be uniformly better :)

asl avatar May 29 '24 20:05 asl

Is the invocation of these functions somewhere on the critical path for the compiler?

fruffy avatar May 29 '24 22:05 fruffy

Is the invocation of these functions somewhere on the critical path for the compiler?

Depends. But some backends essentially produce jsons only. I know also downstream cases where lots of jsons are emitted for control plane, etc.

asl avatar May 29 '24 22:05 asl