libpy_simdjson

Consider adding GB/s numbers to benchmarks

Open · lemire opened this issue · 0 comments

As is common in Python, you express the benchmark results in microseconds per event. I find this hard to reason about, and I submit to you that the same is true for most people...

Is 374.2130 microseconds good or bad?

Let us take the twitter result from the README:

>>> gb = 631515 / (1000*1000*1000.)
>>> t = 374.2130 / (1000*1000)
>>> gb/t
1.687581671401047

So you are hitting 1.7 GB/s. That's somewhat easier to reason about. We know that the fastest disks (e.g., from the PlayStation 5) reach 5 GB/s or even better. And so forth. Another good metric might be "seconds per megabyte".

Basically, I submit to you that it is best to normalize the results.
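To make the suggestion concrete, here is a minimal sketch of such a normalization helper. The function name `throughput_gb_per_s` and the `print` usage are my own illustration, not part of any existing tooling; the numbers come from the twitter example above.

```python
def throughput_gb_per_s(size_bytes, time_us):
    """Normalize a benchmark result (bytes processed, microseconds taken) to GB/s."""
    gb = size_bytes / 1e9      # decimal gigabytes
    seconds = time_us / 1e6    # microseconds to seconds
    return gb / seconds

# twitter.json from the README: 631,515 bytes parsed in 374.2130 microseconds
print(throughput_gb_per_s(631515, 374.2130))  # → 1.687581671401047
```

A benchmark harness could apply this to each result it reports, so readers see GB/s alongside (or instead of) microseconds per event.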

I realize that it is inconvenient because that's not how the tooling is built.

lemire · Jul 16 '20 16:07