Real benchmark

Open whatvn opened this issue 9 years ago • 10 comments

Hello. This is not an issue, but a request with a real use case in mind.

I see that all the benchmarks on the Sophia homepage were done with small keys and values, which is not practical enough.

Could you please run the same benchmarks with larger keys and values: 1 KB, 5 KB, 10 KB, 50 KB, 100 KB...? The reason behind this request is that I have experience with various databases such as RocksDB, LevelDB, LMDB and BDB. All the benchmarks listed on their home pages are promising, and they are fast with small keys and values. But with larger sizes, all of them fail to achieve that speed.

Thanks,

whatvn avatar May 10 '15 11:05 whatvn

It's all about the use case. The benchmark on the website is based on a real use case being prepared for production: using Sophia to store and search over a huge number of indexed emails.

We should understand that when we try to store bigger values, the game becomes more about throughput (MB/s) than requests per second. Does it really change in that case? In the very best case, any persistent and durable database can't write faster than disk speed (assuming the cache is off) and must deal with the Write-Amplification Factor (append-only) vs. Random Access Time (random read or B-tree update).
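To make the throughput point concrete, here is a rough back-of-the-envelope sketch. It assumes the disk is the only bottleneck; the bandwidth and write-amplification numbers are made-up illustrations, not measurements of Sophia or any other engine, and `max_writes_per_second` is a hypothetical helper.

```python
# Rough model: upper bound on write requests/second when the disk is the bottleneck.
# All numbers below are illustrative assumptions, not measurements of any engine.


def max_writes_per_second(disk_mb_per_s: float, value_bytes: int,
                          write_amplification: float = 1.0) -> float:
    """Upper bound on sustained writes/s for a given value size.

    Each user write of `value_bytes` costs `value_bytes * write_amplification`
    bytes of disk bandwidth, so the bound shrinks as values grow.
    """
    disk_bytes_per_s = disk_mb_per_s * 1_000_000
    bytes_per_request = value_bytes * write_amplification
    return disk_bytes_per_s / bytes_per_request


if __name__ == "__main__":
    DISK_MB_PER_S = 500.0  # assumed sequential write bandwidth
    WAF = 5.0              # assumed write-amplification factor
    for size_kb in (1, 5, 10, 50, 100):
        ops = max_writes_per_second(DISK_MB_PER_S, size_kb * 1024, WAF)
        print(f"{size_kb:>4} KB values: at most {ops:,.0f} writes/s")
```

Under this model the MB/s figure stays constant while requests per second drop in proportion to the value size, which is why small-value and large-value numbers are hard to compare directly.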

Anyway, I understand your concern and will try to extend future benchmark coverage. Thanks :)

pmwkaa avatar May 12 '15 09:05 pmwkaa

It'd be nice to include some benchmarks on the website.

arthurprs avatar Jan 26 '16 10:01 arthurprs

There is a performance comparison: http://sophia.systems/performance.html Do you mean something else?

pmwkaa avatar Jan 26 '16 10:01 pmwkaa

Sorry, I meant an updated one. This benchmark is nice, but it doesn't really give a good overview of the read performance, space amplification and compression of the storage engine. Also, if all reads go to disk, it's very uninteresting, as 99.9% of users are trying to avoid that.

arthurprs avatar Jan 26 '16 10:01 arthurprs

I see. Good benchmarks take a lot of time to prepare and get right; I'll see what I can do next time. It would probably be nice to have different read/write patterns, something like what YCSB does.

pmwkaa avatar Jan 26 '16 11:01 pmwkaa

It would be nice to have some sort of poll here about the things that should be present in a benchmark:

  • space amplification
  • compression
  • rps, latency
  • different read/write ratios
  • percentiles
  • in-memory modes benchmarking

Anything else?
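As a minimal sketch of how a few of the items above (rps, latency percentiles, different read/write ratios) could be collected together: the harness below is hypothetical, and a plain Python `dict` stands in for the storage engine, so it does not use or represent the Sophia API. The nearest-rank percentile is one common definition among several.

```python
import math
import random
import time


def run_mixed_workload(store, n_ops, read_ratio, value_size, seed=0):
    """Run n_ops get/set operations against `store`.

    `store` is any dict-like object (a stand-in for a real engine).
    Returns the list of per-operation latencies in seconds.
    """
    rng = random.Random(seed)
    value = b"x" * value_size
    latencies = []
    for _ in range(n_ops):
        key = rng.randrange(n_ops)
        is_read = rng.random() < read_ratio
        start = time.perf_counter()
        if is_read:
            store.get(key)
        else:
            store[key] = value
        latencies.append(time.perf_counter() - start)
    return latencies


def percentile(samples, p):
    """Nearest-rank percentile of `samples`, with p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    N_OPS = 100_000
    for read_ratio in (0.5, 0.95):  # assumed workload mixes
        lat = run_mixed_workload({}, N_OPS, read_ratio, value_size=1024)
        rps = N_OPS / sum(lat)
        print(f"reads={read_ratio:.0%} rps={rps:,.0f} "
              f"p50={percentile(lat, 50) * 1e6:.2f}us "
              f"p99={percentile(lat, 99) * 1e6:.2f}us")
```

Reporting percentiles next to the average matters because stalls show up in the tail (p99 and above) long before they move the mean.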

pmwkaa avatar Jan 26 '16 11:01 pmwkaa

I'd say throughput, latency (stalls etc.) and compression are the main selling numbers.

Edit: those on different workloads are even better.

I think read/write/space amplification are all nice but take extra effort to measure; they can also be roughly inferred by looking at the overall picture.
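For reference, the three amplification factors are usually defined as simple ratios. The sketch below assumes the byte counters are already available (for example from OS-level I/O statistics, which is where the extra measuring effort goes); `IoCounters` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IoCounters:
    """Hypothetical counters gathered during a benchmark run."""
    user_bytes_written: int   # payload bytes the application asked to store
    disk_bytes_written: int   # bytes the engine actually wrote (log, compaction, ...)
    user_bytes_read: int      # payload bytes returned to the application
    disk_bytes_read: int      # bytes the engine actually read from disk
    logical_data_bytes: int   # size of the live key/value data
    disk_space_bytes: int     # size of the database files on disk


def amplification(c: IoCounters) -> dict:
    """Return write, read and space amplification as plain ratios."""
    return {
        "write": c.disk_bytes_written / c.user_bytes_written,
        "read": c.disk_bytes_read / c.user_bytes_read,
        "space": c.disk_space_bytes / c.logical_data_bytes,
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    counters = IoCounters(
        user_bytes_written=100, disk_bytes_written=500,
        user_bytes_read=100, disk_bytes_read=300,
        logical_data_bytes=100, disk_space_bytes=150,
    )
    print(amplification(counters))
```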

arthurprs avatar Jan 26 '16 11:01 arthurprs

I believe Mark is a guy who gets benchmarks right, for example: http://smalldatum.blogspot.com.br/2015/06/rocksdb-forestdb-via-forestdb-benchmark.html

arthurprs avatar Jan 26 '16 11:01 arthurprs

Mark is a good example. :)

pmwkaa avatar Jan 26 '16 11:01 pmwkaa

Thanks. If there is a port for Sophia in the ForestDB Benchmark then I will try to run it in the next few months. https://github.com/couchbaselabs/ForestDB-Benchmark

By the way, I am a big fan of Sophia and Tarantool.

Mark Callaghan

mdcallag avatar Jan 26 '16 16:01 mdcallag