Request to support simd json
RapidJSON offers much more than just parsing: it also helps you generate JSON and offers various other convenient functions. simdjson (https://github.com/lemire/simdjson) is not as convenient as RapidJSON, but if we only need to parse the document, it seems to be a faster alternative.
Hey Hrishikesh! I've thought about adding simdjson integration in the past. My primary hesitation has been that it's not actually a conformant JSON parser.
In my understanding, simdjson uses SIMD instructions to compute a bitmap of "structural elements" (characters like commas, curly braces, double quotes, etc.) in the incoming string, which it then iterates over to infer the original JSON tree. You get insane performance because the vector engines can chunk up the string and compute the bitmap in parallel, but the concept of "structural elements" isn't very well defined.
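To make the bitmap idea concrete, here is a minimal scalar sketch. It is not simdjson's code: the function names `structural_bitmap` and `structural_positions` are made up for illustration, the set of characters treated as structural is an assumption, and a real SIMD implementation would compare a whole block of bytes at once instead of looping one byte at a time. It also ignores details such as characters that appear inside string values.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: returns one 64-bit mask per 64-byte block of input.
// Bit i of block b is set when input[b * 64 + i] is one of the characters
// treated as structural in this sketch (an assumed set, not simdjson's).
std::vector<std::uint64_t> structural_bitmap(std::string_view input) {
    std::vector<std::uint64_t> masks((input.size() + 63) / 64, 0);
    for (std::size_t i = 0; i < input.size(); ++i) {
        switch (input[i]) {
            case '{': case '}': case '[': case ']':
            case ':': case ',': case '"':
                masks[i / 64] |= std::uint64_t{1} << (i % 64);
                break;
            default:
                break;
        }
    }
    return masks;
}

// Hypothetical helper: expands the masks back into byte offsets, which is
// the list a later pass would walk to rebuild the tree.
std::vector<std::size_t> structural_positions(
        const std::vector<std::uint64_t>& masks) {
    std::vector<std::size_t> out;
    for (std::size_t b = 0; b < masks.size(); ++b) {
        std::uint64_t m = masks[b];
        while (m != 0) {
            std::size_t bit = 0;
            while (((m >> bit) & 1) == 0) ++bit;  // index of lowest set bit
            out.push_back(b * 64 + bit);
            m &= m - 1;                           // clear lowest set bit
        }
    }
    return out;
}
```

For the input `{"a":1}` this marks offsets 0, 1, 3, 4 and 6 (the braces, the two quotes and the colon). The second pass only ever looks at those offsets, which is where the speed comes from.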
It can definitely parse anything that is actually JSON, but it'll also parse a large set of documents that definitely are not JSON. The effective grammar is much looser.
That being said, we could add it and just document the fact that it shouldn't be used if the input string isn't from a trusted source.
The additional problem is that I'm not sure what kind of performance improvement we'll see. After simdjson runs, Dart still needs to construct its own representation of the document simdjson parsed, which I'm assuming will be significantly slower. Testing that I've done on my personal laptop suggests that Dart is capable of serializing at around 600-700 MB/second, which would bottleneck simdjson, which claims parsing speeds on the order of GB/second. I'm definitely down to try it as an experimental thing and see what happens, though.
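As a rough back-of-the-envelope for that bottleneck: if the two stages run one after the other over the same bytes, the per-byte times add, so the combined rate is always below the slower stage. The sketch below assumes exactly that sequential model. The function name `combined_throughput` is made up, the 2000 MB/s parse figure is a placeholder for "on the order of GB/second", and 650 MB/s is just the midpoint of the 600-700 range above, used here as a stand-in for the cost of building Dart's representation.

```cpp
// Hypothetical helper: effective throughput of two stages that each pass
// over the same input sequentially. Time per byte is the sum of the
// per-stage times, so the rates combine harmonically.
double combined_throughput(double parse_mb_per_s, double build_mb_per_s) {
    return 1.0 / (1.0 / parse_mb_per_s + 1.0 / build_mb_per_s);
}
```

With those placeholder numbers, `combined_throughput(2000.0, 650.0)` comes out at roughly 490 MB/s, so under this model a faster front-end parser alone cannot lift end-to-end throughput past the construction stage.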
Until then, if you install sajson and build with the -DDART_USE_SAJSON flag, you should see about a 2x improvement in parsing performance by just using sajson. sajson strikes a really nice balance between performance and conformance, and also happens to organize its parse tree in a way that's really complementary to the Dart lowering logic.
Note that this switch to sajson will only significantly affect parsing performance when parsing in finalized mode. So code like the following:
auto pkt = dart::packet::from_json(R"({"hello":"world"})", true);
auto buf = dart::buffer::from_json(R"({"hello":"world"})");
Would see the improvement, but code like:
auto pkt = dart::packet::from_json(R"({"hello":"world"})", false);
auto heap = dart::heap::from_json(R"({"hello":"world"})");
Will likely perform more like RapidJSON, because creating the mutable representation is relatively expensive.
Just wanted to update here that I was apparently incorrect about the conformance bit with simdjson. I last looked at the project about 9 months ago; it's come quite a long way since then and now supports full document validation.