Performance Benchmarks
While everyone should do their own benchmarking, we need to give developers somewhere to start. This will anchor expectations, and we can then help folks tune their installations as they progress.
Copied over from a duplicate issue:
Create benchmarks that can be executed to measure the performance of Conduit. Specifically, we are interested in the performance of pipeline execution (e.g. bytes/s, messages/s). Think about different scenarios (e.g. lots of small messages, a few huge messages, bursts of messages). These tests should give us a good understanding of how changes to Conduit internals impact its performance. In the future, we can add a CI job that periodically runs these benchmarks and provides a view into the performance of Conduit over time.
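As a rough illustration of the metrics mentioned above, here is a minimal Python sketch of how messages/s and bytes/s could be derived from raw counters collected during a run. The names `Throughput` and `summarize` are hypothetical and not part of Conduit. The sketch also assumes that milliseconds per record is simply elapsed time divided by record count, which may differ from how time in the pipeline is actually measured.

```python
from dataclasses import dataclass


@dataclass
class Throughput:
    """Hypothetical container for the figures a benchmark run reports."""
    messages_per_s: float
    bytes_per_s: float
    ms_per_record: float


def summarize(record_count: int, total_bytes: int, elapsed_s: float) -> Throughput:
    """Derive throughput figures from raw counters (assumed inputs)."""
    if record_count <= 0 or elapsed_s <= 0:
        raise ValueError("record_count and elapsed_s must be positive")
    return Throughput(
        messages_per_s=record_count / elapsed_s,
        bytes_per_s=total_bytes / elapsed_s,
        # Assumption: average wall-clock time per record, not per-record latency.
        ms_per_record=elapsed_s * 1000 / record_count,
    )


if __name__ == "__main__":
    # Example: 1000 records totalling 2 MB processed in 4 seconds.
    print(summarize(1000, 2_000_000, 4.0))
```

Reporting both messages/s and bytes/s matters for the scenarios listed: a workload of many small messages and one of a few huge messages can have similar bytes/s but very different messages/s.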
Checklist:
- [x] Ability to run the benchmark manually
- [x] Print msg/s, bytes/s on request
- [x] Verify that pipeline time is correct (i.e. that we have correctly calculated the amount of time records are in the pipeline itself)
- [x] Generator: ability to run multiple sources https://github.com/ConduitIO/conduit-connector-generator/pull/17
- [x] Generator: ability to have bursts: https://github.com/ConduitIO/conduit-connector-generator/pull/20
- [x] Generator: ability to have large payloads https://github.com/ConduitIO/conduit-connector-generator/pull/18
- [x] NoOp destination
- [x] ~~Run the generator as a standalone plugin~~ (we said we're not interested in this)
- [x] Add milliseconds per record in results
- [x] Print results to CSV
- [x] Workload with large files -- script added, but test fails due to https://github.com/ConduitIO/conduit/issues/547
- [x] Run all workloads by default
- [x] Assign a fixed amount of resources (CPU, memory) to Docker containers in which tests are run
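For the "Print results to CSV" item in the checklist above, the output could look roughly like the following Python sketch. The column names and the `results_to_csv` helper are assumptions made for illustration; the actual format is whatever the benchmark code in the repository emits.

```python
import csv
import io

# Hypothetical column layout; not taken from the real benchmark output.
FIELDNAMES = ["workload", "messages_per_s", "bytes_per_s", "ms_per_record"]


def results_to_csv(rows):
    """Render benchmark result rows (dicts keyed by FIELDNAMES) as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    rows = [
        {
            "workload": "small-messages",
            "messages_per_s": 250.0,
            "bytes_per_s": 500000.0,
            "ms_per_record": 4.0,
        }
    ]
    print(results_to_csv(rows), end="")
```

One row per workload keeps results from different runs easy to compare or load into a spreadsheet.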
Code moved to https://github.com/ConduitIO/streaming-benchmarks.
Complete!