JetStream
Understanding the intuition behind `request-rate`
I have conducted an analysis of the `request-rate` and `interval` variables in `benchmarking_script.py` and would like to ensure that my understanding is correct.
My understanding is that the `request-rate` parameter introduces a delay between consecutive requests to mimic a target queries-per-second (QPS) rate. For example, if it is set to 5, then about 5 requests are sent per second on average, provided the number of samples is sufficiently large.
That being said, the delay implementation generates random delay values with fairly high variance (for an exponential distribution, the standard deviation equals the mean), but the average is fairly consistent:
interval = np.random.exponential(1.0 / request_rate)
The graph of the exponential distribution (with mean `1/request-rate`) from which `interval` is being sampled
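To make the role of `interval` concrete, here is a minimal sketch of how I picture the sampled delay being used to pace requests. This is my own illustration, not code copied from `benchmarking_script.py`: the names `paced_requests` and `main` are hypothetical, and I use `np.random.default_rng` instead of the global `np.random.exponential` call only so the sketch can be seeded.

```python
import asyncio
import numpy as np


async def paced_requests(requests, request_rate, rng=None):
    """Yield requests one at a time, sleeping a random interval after each.

    Hypothetical helper: the interval is drawn from an exponential
    distribution with mean 1 / request_rate, as in the line quoted above.
    """
    rng = rng or np.random.default_rng()
    for request in requests:
        yield request
        if request_rate == float("inf"):
            continue  # assumption: an infinite rate means no pacing at all
        interval = rng.exponential(1.0 / request_rate)
        await asyncio.sleep(interval)


async def main():
    sent = []
    # A high rate keeps the demo fast; each sleep averages 1/50 s.
    async for request in paced_requests(["a", "b", "c"], request_rate=50.0):
        sent.append(request)
    return sent


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The order of requests is unchanged; only the gaps between them are random.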
When I plot the `interval` values for a given `request-rate` (e.g., 5), I get the following plot after running it 1000 times:
With these statistics:
Mean: 0.19030281331325313
Variance: 0.03450950095960781
Standard Deviation: 0.18576733017300917
Minimum: 6.344257769502491e-05
Maximum: 1.2152877332855887
Sum: 190.30281331325313
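Given that the mean is around 0.2 seconds, the overall QPS is roughly 5 requests per second (since 5 × 0.2 s = 1 second). For this particular sample, the effective rate is 1000 / 190.30 ≈ 5.25 requests per second.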
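For anyone who wants to reproduce numbers of this kind, the following sketch samples the intervals and computes the same statistics. The function name `interval_stats` and the fixed seed are my own assumptions, so the exact values will differ from the ones I posted above.

```python
import numpy as np


def interval_stats(request_rate, n_samples=1000, seed=0):
    """Sample exponential intervals and summarize them (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seeded only for repeatability
    intervals = rng.exponential(1.0 / request_rate, size=n_samples)
    total = float(intervals.sum())
    return {
        "mean": float(intervals.mean()),
        "variance": float(intervals.var()),
        "std": float(intervals.std()),
        "min": float(intervals.min()),
        "max": float(intervals.max()),
        "sum": total,
        # Effective rate: number of requests divided by total elapsed time.
        "effective_qps": n_samples / total,
    }


if __name__ == "__main__":
    for name, value in interval_stats(request_rate=5).items():
        print(f"{name}: {value}")
```

The mean should land near `1 / request_rate` and the effective QPS near `request_rate`, with the gap shrinking as `n_samples` grows.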
Here are the statistics for `request-rate` = 10:
Mean: 0.0983383984114717
Variance: 0.008309511474488432
Standard Deviation: 0.0911565218428634
Minimum: 3.9151506088517006e-05
Maximum: 0.6463796229426464
Sum: 98.3383984114717
In conclusion, the `request-rate` parameter effectively mimics the QPS (queries per second) metric when the number of samples is large enough.
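One way to check this conclusion directly is to turn the sampled intervals into arrival times and count how many requests fall into each one-second window. This is only a simulation sketch under my own assumptions (the name `requests_per_second`, the sample size, and the seed are hypothetical), and it does not send any real requests.

```python
import numpy as np


def requests_per_second(request_rate, n_requests=10_000, seed=0):
    """Count simulated arrivals in each complete one-second window."""
    rng = np.random.default_rng(seed)
    intervals = rng.exponential(1.0 / request_rate, size=n_requests)
    arrival_times = np.cumsum(intervals)  # time at which each request is sent
    n_windows = int(arrival_times[-1])    # drop the final partial window
    edges = np.arange(n_windows + 1)      # window edges 0, 1, ..., n_windows
    counts, _ = np.histogram(arrival_times, bins=edges)
    return counts


if __name__ == "__main__":
    counts = requests_per_second(request_rate=5)
    print("windows:", counts.size)
    print("mean requests per second:", counts.mean())
    print("min / max in a window:", counts.min(), counts.max())
```

Individual windows vary quite a bit, but the average count per window should settle close to `request_rate`.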
I would like to confirm that my understanding is correct and document this in the issues section for anyone else who might have the same question.