Consider adding GB/s numbers to benchmarks
As is common in Python, you express the benchmark in terms of microseconds per event. I find that this is hard to reason about, and I submit to you that the same is true for most people...
Is 374.2130 microseconds good or bad?
Let us take the twitter result from the README:
>>> gb = 631515 / (1000*1000*1000.)  # twitter.json size in bytes -> GB
>>> t = 374.2130 / (1000*1000)       # parse time in microseconds -> seconds
>>> gb/t
1.687581671401047
So you are hitting 1.7 GB/s. That's somewhat easier to reason about. We know that the fastest disks (e.g., from the PlayStation 5) reach 5 GB/s or even better. And so forth. Another good metric might be "seconds per megabyte".
Basically, I submit to you that it is best to normalize the results.
I realize that it is inconvenient because that's not how the tooling is built.
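For what it's worth, the conversion is easy to wrap in a helper. Here is a minimal sketch (the name gb_per_second is mine, just for illustration; it is not something the benchmark tooling provides):

>>> def gb_per_second(size_bytes, elapsed_us):
...     """Normalize a (size in bytes, time in microseconds) result to GB/s."""
...     return (size_bytes / 1e9) / (elapsed_us / 1e6)
...
>>> gb_per_second(631515, 374.2130)
1.687581671401047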