
README.md - timings and maxrss

Open pkoppstein opened this issue 6 years ago • 2 comments

On the README.md page, some comparisons with jq are made. Since jq also has a streaming parser, I believe it would be helpful to compare the performance of the two streaming parsers.

It would also be helpful to add something about memory utilization; since my /usr/bin/time gives maxrss, I've given that below.

Here are the results of my timings on a 3GHz 16MB RAM machine:

(i) jj 'features.10000.properties.LOT_NUM' -i citylots.json

091
user   0.01s
sys    0.14s 
maxrss 197627904

(ii) jq-1.5 -n --stream 'first(inputs | select(.[0] == ["features",10000,"properties","LOT_NUM"])) | .[1]' citylots.json

"091"
user   0.60
sys    0.00
maxrss 2084864 

(iii) As above but with jq-1.6rc1

"091"
user   0.61
sys    0.00
maxrss 2072576
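For anyone who wants to reproduce figures like these without relying on a particular `/usr/bin/time` output format, here is a minimal sketch that collects the same three numbers. The `measure` helper is a hypothetical name, not something shipped with jj or jq. It also assumes the usual unit convention for `ru_maxrss` (bytes on macOS, kilobytes on Linux), so check your platform before comparing numbers.

```python
import resource
import subprocess
import sys


def measure(cmd):
    """Run cmd and return (user_seconds, sys_seconds, maxrss_bytes).

    Hypothetical helper for illustration only.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Discard the command's output; only resource usage is of interest here.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU times accumulate over all waited-for children, so take the difference.
    user_s = after.ru_utime - before.ru_utime
    sys_s = after.ru_stime - before.ru_stime

    # ru_maxrss is a high-water mark over all waited-for children, not a
    # per-child value, so call this once per fresh interpreter for clean numbers.
    # Assumption: macOS reports bytes, Linux reports kilobytes.
    raw = after.ru_maxrss
    maxrss_bytes = raw if sys.platform == "darwin" else raw * 1024
    return user_s, sys_s, maxrss_bytes
```

Usage would look like `measure(["jj", "features.10000.properties.LOT_NUM", "-i", "citylots.json"])`, with the jq command passed the same way as a list of arguments.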

pkoppstein avatar Feb 26 '18 04:02 pkoppstein

Hi pkoppstein, I'm looking into these issues. I believe jj is buffering too much data prior to processing, and low-memory systems suffer when dealing with large JSON files. I'll look ASAP and keep you posted. Thanks!

tidwall avatar Feb 28 '18 04:02 tidwall

@tidwall - No doubt jj's memory utilization could be improved but please understand that that was not the point of this "issue". I was just suggesting that (a) an apples-to-apples comparison with jq would be appropriate on the README page (i.e., using the streaming parser in both cases); and (b) some empirical information about memory utilization would also be helpful.

Speaking of documentation, some information about the intended behavior of jj on "wonky" JSON would also be helpful. If the intent is that jj behavior on what I call quasi-JSON is undefined, then so be it :-)

Meanwhile, I'm really impressed that jj is so fast on valid JSON!

pkoppstein avatar Feb 28 '18 07:02 pkoppstein