Benchmarks for RES
The discussion on the implementation of optimistic locking made me think it would be beneficial to have an automated way to benchmark RES. The reasons are pretty obvious:
- to make sure there are no performance regressions
- to have an ability to compare solutions and pick the best one
- to prove or disprove that a certain patch brings performance benefits
I came up with the following list of requirements:
- benchmark real-world scenarios (as much as possible)
- make benchmarks reproducible
- make it easy to compare multiple versions of RES
- make it easy to specify the RES source and version, e.g. a branch, a RubyGems release, a path on disk, etc. (see the Gemfile sketch below)
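To illustrate that last requirement, here is a minimal sketch of how the different sources could be expressed as separate Gemfiles and bundled independently. The file layout and the GitHub path are assumptions for illustration, not necessarily what the repo does:

```ruby
# gemfiles/Gemfile.v0.15 -- a RubyGems release (hypothetical file layout)
source "https://rubygems.org"
gem "rails_event_store", "0.15.0"

# gemfiles/Gemfile.master -- a git branch
# gem "rails_event_store", github: "RailsEventStore/rails_event_store", branch: "master"

# gemfiles/Gemfile.local -- a path on disk
# gem "rails_event_store", path: "../rails_event_store"
```

Each variant could then be run in isolation with something like `BUNDLE_GEMFILE=gemfiles/Gemfile.v0.15 bundle exec ruby benchmark.rb`.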
This is an initial attempt to implement such a solution: https://github.com/mlomnicki/res-benchmarks
For now there are two benchmarks (roughly sketched below):
- publish events to stream
- read 50 events from stream
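For reference, a rough sketch of what those two scenarios could look like with benchmark-ips. The event class, stream names, and the bare `Client.new` setup are illustrative assumptions (a configured database is assumed); the API calls match the keyword-argument style of RES >= 0.12:

```ruby
require "benchmark/ips"
require "rails_event_store"

OrderPlaced = Class.new(RailsEventStore::Event)
event_store = RailsEventStore::Client.new

# Seed a stream with 50 events for the "read" scenario.
50.times { event_store.publish_event(OrderPlaced.new(data: {}), stream_name: "seeded_stream") }

Benchmark.ips do |x|
  x.report("publish event to stream") do
    event_store.publish_event(OrderPlaced.new(data: {}), stream_name: "bench_stream")
  end
  x.report("read 50 events from stream") do
    event_store.read_stream_events_forward("seeded_stream")
  end
end
```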
It benchmarks the following RES versions:
- v0.9 to v0.15
- master branch
- locking_friendly branch
The benchmarks have already proved to be useful. The "read" benchmark discovered that there was a major regression in RES 0.12.
https://benchmark.fyi/T
As you can see, the last version where reading events is efficient is 0.11. In 0.12, reading from a stream became 42x slower.
No surprises when it comes to publishing events:
http://benchmark.fyi/U
I'd like to get some feedback from you. In particular:
- do you think it's useful?
- what to benchmark?
- should the code live in a separate repo? moved to the RES organization? added to the rails_event_store repo?
Obviously there's huge room for improvement:
- show results on nice charts
- plug into CI and benchmark every commit and/or release
- benchmark with mysql and postgres
Feedback is more than welcome
Sounds useful as a way to know whether there was a regression. Reminds me a bit of http://speed.pypy.org
It would be nice to have this as a continuous job, with results published somewhere. Maybe we could already store build artifacts from Travis runs (benchmark results in JSON) and present them grouped in a webapp in the future?
I acknowledge that CI performance will vary, but it should be enough to catch serious performance regressions.
> store build artifacts from Travis runs (benchmark results in JSON) and present them grouped in a webapp in the future
Yep, that would be great. I imagine the webapp would be built with RES. That would be quite meta :smiley:
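To make the artifacts idea concrete, each run could dump its results to a JSON file that CI archives. The field names, the `RES_VERSION` env var, and the `artifacts/` path below are made up for illustration:

```ruby
require "json"
require "time"
require "fileutils"

# Hypothetical shape of a single result; in practice the numbers would come
# from the benchmark run (e.g. benchmark-ips) and the environment.
result = {
  res_version: ENV.fetch("RES_VERSION", "master"),
  benchmark:   "read_50_events_from_stream",
  ips:         1234.5,                 # placeholder, taken from the actual run
  ran_at:      Time.now.utc.iso8601
}

FileUtils.mkdir_p("artifacts")
File.write(
  "artifacts/#{result[:res_version]}-#{result[:benchmark]}.json",
  JSON.pretty_generate(result)
)
```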
So I had a closer look at the alleged issue of reading events in RES 0.12 being 42 times slower than in RES 0.11. It turned out that there is indeed an issue, but in my benchmark, not in RES.
Correct results below:
http://benchmark.fyi/V
What was the issue then?
The API in RES 0.12 has changed from

```ruby
def publish_event(event, stream_name)
```

to keyword arguments:

```ruby
def publish_event(event, stream_name:)
```
The benchmark assumed keyword arguments for all RES versions, though. That's a valid assumption for RES >= 0.12, but here's what happens in RES <= 0.11:
```ruby
# it passes a hash as the stream name. Not a string!
es.publish_event(some_event, stream_name: "some_stream")

# INSERT INTO event_store_events (stream_name, ...)
# VALUES ('{"stream_name" => "some_stream"}', ...)

# Now this reads zero events because "some_stream" is empty
es.read_stream_events_forward("some_stream")
```
In other words, the benchmark showed that reading zero events is 42x faster than reading 50 events. Thank you, captain obvious.
Fix: https://github.com/mlomnicki/res-benchmarks/commit/66264834a5653fc207274e702eddb013648b9659
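For completeness, a version-aware call could look roughly like this. It's a sketch, not the actual commit, and it reuses `es` and `some_event` from the snippet above:

```ruby
# Pick positional vs keyword arguments based on the installed gem version.
res_version = Gem.loaded_specs["rails_event_store"].version

if res_version >= Gem::Version.new("0.12.0")
  es.publish_event(some_event, stream_name: "some_stream")  # keyword-argument API (>= 0.12)
else
  es.publish_event(some_event, "some_stream")               # positional-argument API (<= 0.11)
end
```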
Lesson learnt: TDD your benchmarks? 🤔
@mlomnicki I think it would be very beneficial and it makes sense. As to HOW...
I believe the mono-repo could have a benchmark directory and would know how to run a certain standardized test against its current version.
A separate repo on top of that would know how to check out specific versions of the gem and run the benchmark.
I believe that might be the better solution, because it would force us to continuously keep the benchmarks working.
The downside is that it would be hard to benchmark historical code with a certain scenario, unless we maintain that scenario on many branches.
What kind of benchmarks would you like to see?
- reading 10, 100, 1000, 10_000, 100_000 events? (see the sketch after this list)
- reading with different serializers?
- writing with different serializers?
- writing with different strategies?
- writing with different repositories (i.e. PgLinearized)?
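To sketch the first idea: the stream names, sizes, and the bare `Client.new` setup below are illustrative, and the streams are assumed to have been seeded beforehand:

```ruby
require "benchmark/ips"
require "rails_event_store"

event_store  = RailsEventStore::Client.new
STREAM_SIZES = [10, 100, 1_000, 10_000, 100_000]

Benchmark.ips do |x|
  STREAM_SIZES.each do |size|
    stream = "stream_with_#{size}_events"  # assumed to exist and be seeded already
    x.report("read #{size} events") { event_store.read_stream_events_forward(stream) }
  end
end
```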
@pawelpacana these all sound nice.
> writing with different strategies? writing with different repositories (i.e. PgLinearized)?
These might require multiple producers and contention to see any actual effect.
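Something along these lines, perhaps. The thread and event counts, the event class, and the stream name are arbitrary; the point is just several producers hitting the same stream concurrently:

```ruby
require "rails_event_store"

OrderPlaced = Class.new(RailsEventStore::Event)
event_store = RailsEventStore::Client.new

# Several producers publishing concurrently to the same stream to create contention.
producers = Array.new(4) do
  Thread.new do
    100.times do
      event_store.publish_event(OrderPlaced.new(data: {}), stream_name: "contended_stream")
    end
  end
end
producers.each(&:join)
```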
https://rubybench.org could be a good start