JetStream Understanding the intuition behind `request-rate`

Open hosseinsarshar opened this issue 5 months ago • 0 comments

I have conducted an analysis of the request-rate and interval variables in the benchmarking_script.py and would like to ensure that my understanding is correct.

My understanding is that the request-rate parameter introduces delays between consecutive requests to mimic a target queries-per-second (QPS) rate. For example, if it is set to 5, then on average 5 requests are sent per second, given a sufficiently large number of samples.
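To make the pacing idea concrete, here is a minimal sketch of a request generator that sleeps for a random gap after each request. It is not the benchmark script's actual code: `paced_requests` and `collect` are hypothetical names, and the sketch uses the standard library's `random.expovariate(request_rate)`, which should draw from the same distribution as `np.random.exponential(1.0 / request_rate)`.

```python
import asyncio
import random


async def paced_requests(requests, request_rate, rng=None):
    """Yield each request, then sleep for an exponentially distributed gap.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    for request in requests:
        yield request
        # expovariate takes the rate (lambda), so the mean gap is 1 / request_rate.
        interval = rng.expovariate(request_rate)
        await asyncio.sleep(interval)


async def collect(requests, request_rate, seed=0):
    """Drain the generator and return the requests in the order they were sent."""
    rng = random.Random(seed)
    return [r async for r in paced_requests(requests, request_rate, rng)]


if __name__ == "__main__":
    # A high rate keeps the demo fast; with request_rate=5 the gaps average 0.2 s.
    print(asyncio.run(collect(["a", "b", "c"], request_rate=1000.0)))
```

The order of requests is unchanged; only the spacing between them is randomized.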

That said, the implementation draws each delay from an exponential distribution, so individual delays vary widely (the standard deviation of an exponential distribution equals its mean), but the average is fairly consistent:

interval = np.random.exponential(1.0 / request_rate)

Figure: the exponential distribution with mean 1/request-rate, from which each interval is sampled.
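For reference, these are the standard moments of an exponential distribution, written with the assumption that the rate parameter λ equals request-rate (the symbol λ is introduced here and does not appear in the script):

```latex
\begin{align*}
X &\sim \mathrm{Exp}(\lambda), \qquad \lambda = \text{request-rate} \\
\mathbb{E}[X] &= \frac{1}{\lambda}, \qquad
\operatorname{Var}(X) = \frac{1}{\lambda^{2}}, \qquad
\sigma_X = \frac{1}{\lambda}
\end{align*}
```

For λ = 5 this gives a mean of 0.2, a variance of 0.04, and a standard deviation of 0.2. For λ = 10 it gives 0.1, 0.01, and 0.1.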

When I plot the interval values for a given request-rate (e.g., 5), I get the following plot after running it 1000 times:

With these statistics:

Mean: 0.19030281331325313
Variance: 0.03450950095960781
Standard Deviation: 0.18576733017300917
Minimum: 6.344257769502491e-05
Maximum: 1.2152877332855887
Sum: 190.30281331325313
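Statistics like these can be reproduced with a short script. The sketch below is an approximation of the experiment, not the code I actually ran: `sample_intervals` and `summarize` are hypothetical helpers, the seed is arbitrary, and it uses `random.expovariate` from the standard library in place of `np.random.exponential`, so the exact numbers will differ from those above.

```python
import random
import statistics


def sample_intervals(request_rate, n, seed=0):
    """Draw n inter-request delays with mean 1 / request_rate."""
    rng = random.Random(seed)
    return [rng.expovariate(request_rate) for _ in range(n)]


def summarize(intervals):
    """Compute the same summary statistics as listed in the issue."""
    return {
        "Mean": statistics.fmean(intervals),
        # Population variance and standard deviation (no Bessel correction).
        "Variance": statistics.pvariance(intervals),
        "Standard Deviation": statistics.pstdev(intervals),
        "Minimum": min(intervals),
        "Maximum": max(intervals),
        "Sum": sum(intervals),
    }


if __name__ == "__main__":
    for name, value in summarize(sample_intervals(5, 1000)).items():
        print(f"{name}: {value}")
```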

Given that the mean is around 0.2 seconds, the overall rate is about 5 requests per second (1 / 0.2 = 5; equivalently, 5 requests × 0.2 seconds = 1 second).

Here are the statistics for request-rate = 10:

Mean: 0.0983383984114717
Variance: 0.008309511474488432
Standard Deviation: 0.0911565218428634
Minimum: 3.9151506088517006e-05
Maximum: 0.6463796229426464
Sum: 98.3383984114717

In conclusion, the request-rate parameter effectively mimics the QPS (queries per second) metric when the number of samples is large enough.
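One way to check the "large enough" condition is to compute the effective QPS as the number of requests divided by the total of the sampled delays. For the two runs above that is roughly 1000 / 190.30 ≈ 5.25 and 1000 / 98.34 ≈ 10.17. The sketch below repeats this for several sample sizes. `effective_qps` is a hypothetical helper, the seed is arbitrary, and `random.expovariate` stands in for `np.random.exponential`.

```python
import random


def effective_qps(request_rate, n, seed=0):
    """Return n requests divided by the total simulated time between them."""
    rng = random.Random(seed)
    total_time = sum(rng.expovariate(request_rate) for _ in range(n))
    return n / total_time


if __name__ == "__main__":
    # The estimate should settle near request_rate as n grows.
    for n in (10, 100, 1000, 100_000):
        print(n, effective_qps(5, n))
```

With few samples the effective rate can land noticeably above or below the target, which is why a short benchmark run may not see exactly the configured request-rate.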

I would like to confirm that my understanding is correct and document this in the issues section for anyone else who might have the same question.

Sep 11 '24 21:09 hosseinsarshar
