opentelemetry-cpp
Add stress testing framework, with basic metrics example to demonstrate.
Changes
This PR adds a basic stress testing framework to validate the scalability and reliability of functionality under high-concurrency and long-running workloads. Unlike Google Benchmark, which focuses on micro-benchmarking and latency measurements for isolated operations, this framework runs a given workload under sustained, multi-threaded load. The idea is to complement the existing benchmarks with stress tests that address long-duration and high-concurrency use cases.
This is already implemented for .NET and Rust, and most of the ideas are taken from there. I felt the need for this to test some optimizations I am doing for metrics, but feel free to comment if this doesn't seem helpful.
Also added a basic stress-testing example for metrics to demonstrate. Below are the results from the metrics stress test as an example:
$ ./stress_metrics
Starting stress test with 16 threads...
Throughput: 5009490 it/s | Avg: 4885764 | Min: 4734280 | Max: 5132395
Test completed:
Total iterations: 203373637
Duration: 42 seconds
Average throughput: 4885764 iterations/sec
$
It's still in the early stages and will need further enhancements, but it should be a good starting point. Future improvements could include reporting memory and CPU usage alongside the existing throughput, as well as refining the initial warm-up period so that data collection stays consistent.
Implementation Details:
Worker Threads:
- The worker threads (defaulting to the number of cores) are spawned to execute the workload.
- Each worker thread executes the workload function (func) in a loop until a global STOP flag is set (Ctrl-C).
- Each thread maintains its own iteration count to minimize contention.
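
The worker-thread design above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `WorkerCounter`, `DefaultWorkerCount`, and `RunFor` are hypothetical names, the 64-byte cache-line size is an assumption, and a fixed run duration stands in for the Ctrl-C handling so the example terminates on its own. It assumes C++17 so that `std::vector` honors the over-aligned element type.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical names; this is not the PR's actual code.

// One counter per worker, aligned to its own cache line (64 bytes assumed)
// so that workers do not contend on a shared line.
struct alignas(64) WorkerCounter
{
  std::atomic<std::uint64_t> iterations{0};
};

// Default worker count: the number of hardware threads, or 1 if unknown.
inline std::size_t DefaultWorkerCount()
{
  const unsigned n = std::thread::hardware_concurrency();
  return n == 0 ? 1 : n;
}

// Runs `func` in a loop on `num_threads` workers for `duration`, then sets
// the stop flag, joins the workers, and returns the per-thread counts.
inline std::vector<std::uint64_t> RunFor(const std::function<void()> &func,
                                         std::size_t num_threads,
                                         std::chrono::milliseconds duration)
{
  assert(num_threads > 0);
  std::atomic<bool> stop{false};
  std::vector<WorkerCounter> counters(num_threads);
  std::vector<std::thread> workers;
  workers.reserve(num_threads);

  for (std::size_t i = 0; i < num_threads; ++i)
  {
    workers.emplace_back([&func, &stop, &counters, i] {
      while (!stop.load(std::memory_order_relaxed))
      {
        func();
        // Each worker only ever writes its own counter.
        counters[i].iterations.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }

  // Stand-in for the Ctrl-C handler: stop after a fixed duration.
  std::this_thread::sleep_for(duration);
  stop.store(true, std::memory_order_relaxed);
  for (auto &worker : workers)
  {
    worker.join();
  }

  std::vector<std::uint64_t> counts;
  counts.reserve(num_threads);
  for (const auto &counter : counters)
  {
    counts.push_back(counter.iterations.load(std::memory_order_relaxed));
  }
  return counts;
}
```

A call such as `RunFor(workload, DefaultWorkerCount(), std::chrono::seconds(10))` would return one count per worker. The counters are atomic only because a monitoring thread may read them while the workers are still running; relaxed ordering is enough here since the counts carry no other data with them.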
Throughput Monitoring:
- A separate controller thread monitors throughput by periodically summing up iteration counts across threads.
- Throughput is calculated over a sliding window (SLIDING_WINDOW_SIZE) and displayed dynamically.
Final Summary:
- At the end of the test, the program calculates and prints the total iterations, duration, and average throughput.
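
The throughput monitoring and final summary could look something like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `ThroughputWindow` and `AverageThroughput` are hypothetical names, the window holds the most recent per-interval rates and reports their mean, and Min/Max track the extremes of that windowed value. How the PR actually derives the Avg, Min, and Max columns is not stated in this description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <limits>
#include <numeric>

// Hypothetical names; this is not the PR's actual code.

// Keeps the last `window_size` per-interval rates and reports their mean.
class ThroughputWindow
{
public:
  explicit ThroughputWindow(std::size_t window_size) : window_size_(window_size)
  {
    assert(window_size > 0);
  }

  // `total_iterations` is the sum of all per-thread counters at this tick;
  // `interval_seconds` is the time since the previous tick.
  // Returns the throughput averaged over the current window.
  double Record(std::uint64_t total_iterations, double interval_seconds)
  {
    assert(interval_seconds > 0.0);
    assert(total_iterations >= last_total_);
    const double rate =
        static_cast<double>(total_iterations - last_total_) / interval_seconds;
    last_total_ = total_iterations;

    rates_.push_back(rate);
    if (rates_.size() > window_size_)
    {
      rates_.pop_front();  // Drop the oldest sample once the window is full.
    }

    const double windowed =
        std::accumulate(rates_.begin(), rates_.end(), 0.0) /
        static_cast<double>(rates_.size());
    min_ = std::min(min_, windowed);
    max_ = std::max(max_, windowed);
    return windowed;
  }

  double Min() const { return min_; }
  double Max() const { return max_; }

private:
  std::size_t window_size_;
  std::uint64_t last_total_ = 0;
  std::deque<double> rates_;
  double min_ = std::numeric_limits<double>::infinity();
  double max_ = 0.0;
};

// Final summary figure: total iterations divided by elapsed seconds.
inline double AverageThroughput(std::uint64_t total_iterations, double seconds)
{
  assert(seconds > 0.0);
  return static_cast<double>(total_iterations) / seconds;
}
```

In this sketch the controller thread would call `Record` once per sampling tick with the summed per-thread counts. Averaging over a window smooths out tick-to-tick jitter, at the cost of reacting more slowly to real changes in throughput.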
For significant contributions please make sure you have completed the following items:
- [ ] CHANGELOG.md updated for non-trivial changes
- [ ] Unit tests have been added
- [ ] Changes in public API reviewed
Deploy Preview for opentelemetry-cpp-api-docs canceled.
| Name | Link |
|---|---|
| Latest commit | 4bfadb5ce8bcb0a39f25cb84b6f1d188a7b3f8de |
| Latest deploy log | https://app.netlify.com/projects/opentelemetry-cpp-api-docs/deploys/6864c6a5fc903800086d096c |
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 90.03%. Comparing base (cbfbb02) to head (17fcc54).
Additional details and impacted files
@@ Coverage Diff @@
## main #3241 +/- ##
==========================================
- Coverage 90.06% 90.03% -0.02%
==========================================
Files 220 220
Lines 7069 7069
==========================================
- Hits 6366 6364 -2
- Misses 703 705 +2