Understanding the intuition behind `request-rate`

Open hosseinsarshar opened this issue 5 months ago • 0 comments

I analyzed the `request-rate` and `interval` variables in benchmarking_script.py and would like to confirm that my understanding is correct.

My understanding is that the request-rate parameter introduces a delay between consecutive requests to mimic a target queries-per-second (QPS) rate. For example, if it is set to 5, then on average 5 requests are sent per second, and the observed rate approaches 5 as the number of requests grows.

That being said, the delay values are random with fairly high variance (for an exponential distribution the standard deviation equals the mean), but their average is consistent:

interval = np.random.exponential(1.0 / request_rate)
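To make the role of this line concrete, here is a minimal sketch of a request generator that sleeps for an exponentially distributed interval between requests. The names `paced_requests` and `demo` are hypothetical and are not taken from the benchmark script. The sketch only assumes that the sampled interval is used as the delay between consecutive requests, and that a non-positive rate means no pacing.

```python
import asyncio

import numpy as np


async def paced_requests(requests, request_rate):
    """Yield requests one at a time, pausing between them (hypothetical helper)."""
    for request in requests:
        yield request
        if request_rate <= 0:
            # Assumption for this sketch: a non-positive rate means "no pacing".
            continue
        # Inter-arrival gap drawn from an exponential distribution whose
        # mean is 1 / request_rate seconds.
        interval = np.random.exponential(1.0 / request_rate)
        await asyncio.sleep(interval)


async def demo(n=20, request_rate=200.0):
    """Consume n dummy requests and return how many were yielded."""
    sent = 0
    async for _ in paced_requests(range(n), request_rate):
        sent += 1
    return sent
```

Calling `asyncio.run(demo())` consumes 20 dummy requests at an average of about 200 per second. Exponentially distributed gaps between arrivals are the defining property of a Poisson arrival process, which is a common model for independent user traffic.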

image

The graph of the exponential distribution from which each interval is sampled; its mean is 1/request-rate


When I plot the interval values for a given request-rate (e.g., 5), I get the following plot after running it 1000 times:

image

With these statistics:

Mean: 0.19030281331325313
Variance: 0.03450950095960781
Standard Deviation: 0.18576733017300917
Minimum: 6.344257769502491e-05
Maximum: 1.2152877332855887
Sum: 190.30281331325313
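For anyone who wants to reproduce statistics like these, the following is a small sketch of the experiment. The function name `interval_stats` and the fixed seed are assumptions added for illustration. It uses `np.random.default_rng` rather than the global `np.random.exponential` call so that runs are repeatable, and the exact numbers will differ from the ones listed above.

```python
import numpy as np


def interval_stats(request_rate, n_samples=1000, seed=0):
    """Sample inter-request intervals and summarize them (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Same distribution as in the snippet above: exponential with mean 1 / request_rate.
    intervals = rng.exponential(1.0 / request_rate, size=n_samples)
    return {
        "mean": float(intervals.mean()),
        "variance": float(intervals.var()),
        "std": float(intervals.std()),
        "min": float(intervals.min()),
        "max": float(intervals.max()),
        "sum": float(intervals.sum()),
    }


stats = interval_stats(request_rate=5)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The sum is the total time needed to issue all 1000 requests, so dividing the sample count by it gives the observed request rate.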

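Given that the mean is about 0.19 seconds, close to the expected 1/5 = 0.2 seconds, the overall rate is roughly 5 requests per second (1000 intervals summing to about 190.3 seconds gives 1000 / 190.3 ≈ 5.25 QPS).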
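These numbers can be checked against the closed-form moments of the exponential distribution. Writing $\lambda$ for request-rate and $X$ for a single interval (symbols introduced here for illustration):

```latex
\mathbb{E}[X] = \frac{1}{\lambda}, \qquad
\operatorname{Var}(X) = \frac{1}{\lambda^{2}}, \qquad
\sigma_X = \sqrt{\operatorname{Var}(X)} = \frac{1}{\lambda}
```

For $\lambda = 5$ this gives a mean of 0.2, a variance of 0.04, and a standard deviation of 0.2, which is close to the sampled values of roughly 0.190, 0.0345, and 0.186. The standard deviation equals the mean, which explains why individual intervals vary so much.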


Here are the statistics for request-rate = 10:

image

Mean: 0.0983383984114717
Variance: 0.008309511474488432
Standard Deviation: 0.0911565218428634
Minimum: 3.9151506088517006e-05
Maximum: 0.6463796229426464
Sum: 98.3383984114717

In conclusion, the request-rate parameter sets the average QPS: individual intervals vary widely, but the observed rate converges to request-rate when the number of requests is large enough.
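The dependence on sample size can be illustrated with a short sketch that computes the observed rate as the number of requests divided by the total of the sampled intervals. The helper name `effective_qps` and the seed are assumptions for illustration, not part of the benchmark script.

```python
import numpy as np


def effective_qps(request_rate, n_samples, seed=0):
    """Observed requests per second over n_samples intervals (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    intervals = rng.exponential(1.0 / request_rate, size=n_samples)
    # Total elapsed time is the sum of all gaps between requests.
    return n_samples / intervals.sum()


# The observed rate settles toward request_rate as the sample size grows.
for n in (10, 100, 1_000, 100_000):
    print(n, effective_qps(request_rate=5, n_samples=n))
```

With only a handful of requests the observed rate can be far from the target, so short benchmark runs may see a noticeably different QPS than the configured value.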

I would like to confirm that my understanding is correct and document this in the issues section for anyone else who might have the same question.

hosseinsarshar, Sep 11 '24 21:09