`JSON.parse(read(io, String))` is faster than `JSON.parse(io)`
In a small benchmark, JSON.parse(io) takes about 1.7x as long (median 67.9 ms vs. 39.2 ms) and makes about 1.7x as many allocations (458500 vs. 274058) as first reading the IO into memory and then calling JSON.parse on the resulting String.
Benchmark
using BenchmarkTools, JSON
# sample data
const sample = read(download("https://registry.npmjs.org/react"), String)
j1(io) = JSON.parse(read(io, String))
j2(io) = JSON.parse(io)
@benchmark let
io = IOBuffer()
write(io, $sample)
seekstart(io)
r = j1(io) # swap in j2 to benchmark the streaming method
close(io)
r
end
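The benchmark above also times filling the IOBuffer, which is the same cost for both variants. A variant that keeps buffer construction out of the timed region could look roughly like the sketch below. It assumes a small inline document (`small_sample`, a stand-in chosen so it runs offline) instead of the registry payload, and the helper names `parse_string` and `parse_stream` are only illustrative; absolute numbers will differ from the results reported here.

```julia
using BenchmarkTools, JSON, Statistics

# Small inline document standing in for the registry payload (an assumption).
const small_sample = JSON.json(Dict("name" => "react", "versions" => collect(1:1000)))

parse_string(io) = JSON.parse(read(io, String))  # read everything, then parse the String
parse_stream(io) = JSON.parse(io)                # parse directly from the IO

# `setup` runs outside the timed region; `evals=1` gives every sample a fresh,
# unconsumed buffer.
t_string = @benchmark parse_string(io) setup=(io = IOBuffer(small_sample)) evals=1
t_stream = @benchmark parse_stream(io) setup=(io = IOBuffer(small_sample)) evals=1

println("median time ratio (stream / string): ",
        median(t_stream).time / median(t_string).time)
```

With `evals=1`, each sample is a single call, so very small inputs will be noisy; a larger sample document gives steadier ratios.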
Results:
JSON.parse(read(io, String))
BenchmarkTools.Trial: 118 samples with 1 evaluation.
Range (min … max): 38.245 ms … 60.927 ms ┊ GC (min … max): 0.00% … 26.18%
Time (median): 39.169 ms ┊ GC (median): 0.00%
Time (mean ± σ): 42.542 ms ± 5.252 ms ┊ GC (mean ± σ): 8.21% ± 9.87%
▅█▁
███▅▅▅▃▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▃▃▃▃▄▁▃▄▃▅▃▄▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▃▁▁▃ ▃
38.2 ms Histogram: frequency by time 56.1 ms <
Memory estimate: 22.80 MiB, allocs estimate: 274058.
JSON.parse(io)
BenchmarkTools.Trial: 69 samples with 1 evaluation.
Range (min … max): 66.057 ms … 85.004 ms ┊ GC (min … max): 0.00% … 21.08%
Time (median): 67.862 ms ┊ GC (median): 0.00%
Time (mean ± σ): 72.698 ms ± 6.664 ms ┊ GC (mean ± σ): 8.17% ± 8.06%
█▃▃ ▁
████▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▄█▃▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▄▇▆▃▇▃▁▁▁▃▁▃ ▁
66.1 ms Histogram: frequency by time 84.5 ms <
Memory estimate: 39.45 MiB, allocs estimate: 458500.
Based on this benchmark, an easy performance improvement for JSON.jl would be to have the IO method read the whole stream into memory and delegate to the String method, i.e.:
parse(io::IO; kwargs...) = parse(read(io, String); kwargs...)
One caveat: calling read(io, String) is not always safe. The input might be a multi-gigabyte file that exhausts memory. The stream might also contain JSON followed by data in some other format, in which case reading the entire IO and treating all of it as JSON would be wrong.
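One possible way to limit the memory risk is to buffer only inputs below a size cap and fall back to the streaming method otherwise. The sketch below is a hypothetical helper, not part of JSON.jl: the name `parse_buffered` and the `max_bytes` keyword are made up for illustration, and it assumes the stream supports `mark`/`reset` (true for `IOBuffer` and files, not for every IO). It does not address the trailing-data concern, since the fast path still consumes everything it read.

```julia
using JSON

# Hypothetical helper (not a JSON.jl API): parse from memory when the remaining
# input fits within `max_bytes`, otherwise parse from the stream directly.
function parse_buffered(io::IO; max_bytes::Integer = 64 * 1024^2)
    mark(io)                          # remember the current position
    bytes = read(io, max_bytes + 1)   # read at most one byte past the cap
    if length(bytes) <= max_bytes
        unmark(io)
        return JSON.parse(String(bytes))  # small input: use the String method
    end
    reset(io)                         # too large: rewind to the mark
    return JSON.parse(io)             # and use the streaming method
end

doc = "{\"a\": [1, 2, 3]}"
small = parse_buffered(IOBuffer(doc))                 # takes the in-memory path
large = parse_buffered(IOBuffer(doc); max_bytes = 4)  # forced onto the streaming path
```

The default cap of 64 MiB is arbitrary; whether any cap is an acceptable default for the library is a design question for the maintainers.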