
Add PR and Release benchmark with new changes in framework

Open roshkhatri opened this issue 1 month ago • 3 comments

This adds workflow improvements for the PR and Release benchmarks, which run on `c8g.metal-48xl` for ARM64 and `c7i.metal-48xl` for X86.

Cluster mode: disabled
TLS: disabled
io-threads: 1, 9
Pipelining: 1, 10
Clients: 1600
Benchmark Threads: 90
Data size: 16, 96
Commands: SET, GET
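
As a rough illustration of what the settings above imply, here is a minimal sketch that expands the multi-valued options into individual runs. It assumes every combination is benchmarked (the description does not confirm this), and the key names and the `expand_matrix` helper are hypothetical, not the framework's actual config schema.

```python
from itertools import product

# Values copied from the list above; the key names are illustrative only.
MATRIX = {
    "io_threads": [1, 9],
    "pipelining": [1, 10],
    "data_size": [16, 96],
    "command": ["SET", "GET"],
}

# Settings that stay the same for every run.
FIXED = {
    "cluster_mode": False,
    "tls": False,
    "clients": 1600,
    "benchmark_threads": 90,
}


def expand_matrix(matrix, fixed):
    """Return one config dict per combination of the multi-valued settings."""
    keys = list(matrix)
    return [
        {**fixed, **dict(zip(keys, combo))}
        for combo in product(*(matrix[key] for key in keys))
    ]


configs = expand_matrix(MATRIX, FIXED)
print(len(configs))  # 2 * 2 * 2 * 2 = 16 combinations
```

Under that assumption, each of the 16 combinations would be repeated for every iteration the workflow runs.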

c8g.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c8g/
c7i.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c7i/

vCPU: 192
NUMA nodes: 2
Memory (GiB): 384
Network Bandwidth (Gbps): 50

PR benchmarking runs on the ARM64 machine, as its results have been seen to be more consistent. It also runs 5 iterations of each test and posts the average along with other statistical metrics:

  • CI99%: 99% Confidence Interval - range where the true population mean is likely to fall
  • PI99%: 99% Prediction Interval - range where a single future observation is likely to fall
  • CV: Coefficient of Variation - relative variability (σ/μ × 100%)

Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges.
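
The metrics in the note above can be sketched as follows. This is not the framework's code: it assumes σ is the sample standard deviation and that both intervals use Student's t distribution, which the description does not state. The critical value is hardcoded for n=5 (4 degrees of freedom), and the sample values and the `summarize` helper are made up for illustration.

```python
import math
import statistics

# Two-sided 99% critical value of Student's t for 4 degrees of freedom.
# Only valid for n = 5 runs; other sample sizes need a different value.
T_CRIT_99_DF4 = 4.604


def summarize(samples, t_crit=T_CRIT_99_DF4):
    """Compute mean, sigma, CV, and 99% CI/PI margins for a list of runs."""
    n = len(samples)
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation (n - 1)
    ci_margin = t_crit * sigma / math.sqrt(n)          # interval for the mean
    pi_margin = t_crit * sigma * math.sqrt(1 + 1 / n)  # interval for one new run
    return {
        "n": n,
        "mean": mean,
        "sigma": sigma,
        "cv_pct": sigma / mean * 100,
        "ci99_pct": ci_margin / mean * 100,
        "pi99_pct": pi_margin / mean * 100,
        "ci99_bounds": (mean - ci_margin, mean + ci_margin),
        "pi99_bounds": (mean - pi_margin, mean + pi_margin),
    }


# Hypothetical requests-per-second results from 5 iterations.
runs = [1000.0, 1010.0, 990.0, 1005.0, 995.0]
stats = summarize(runs)
print(f"n={stats['n']}, CV={stats['cv_pct']:.2f}%, "
      f"CI99%=±{stats['ci99_pct']:.2f}%, PI99%=±{stats['pi99_pct']:.2f}%")
```

The prediction interval is always wider than the confidence interval because it covers the spread of a single future run, not just the uncertainty in the mean.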

To compare versions, this adds a workflow that runs on both ARM64 and X86 machines. It also posts the comparison between the versions, like this: https://github.com/valkey-io/valkey/issues/2580#issuecomment-3399539615
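
One simple way to express such a comparison is the relative change in mean throughput between two versions. The sketch below is only an assumption about how a difference could be reported; the `percent_change` helper and the numbers are hypothetical and are not taken from the linked comment.

```python
def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100


# Hypothetical mean requests-per-second for two versions.
baseline_rps = 1000.0
candidate_rps = 1050.0
print(f"{percent_change(baseline_rps, candidate_rps):+.2f}%")
```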

roshkhatri avatar Nov 25 '25 00:11 roshkhatri

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 72.43%. Comparing base (8ea7f13) to head (cec12f6). :warning: Report is 19 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2871      +/-   ##
============================================
- Coverage     72.44%   72.43%   -0.01%     
============================================
  Files           128      128              
  Lines         70415    70439      +24     
============================================
+ Hits          51011    51026      +15     
- Misses        19404    19413       +9     

see 19 files with indirect coverage changes


codecov[bot] avatar Nov 25 '25 02:11 codecov[bot]

@roshkhatri looks like we are adding x86 as well? I am not sure if we should add x86 runs if we are not confident on the stability and numbers yet.

sarthakaggarwal97 avatar Nov 25 '25 21:11 sarthakaggarwal97

> @roshkhatri looks like we are adding x86 as well? I am not sure if we should add x86 runs if we are not confident on the stability and numbers yet.

Yes, we would still like to get the benchmark numbers for X86 when doing releases. The PR benchmark only uses ARM64, though.

roshkhatri avatar Nov 26 '25 03:11 roshkhatri