Robyn
Response refactor
This actually increases performance for static files. Static files are now served async, and JSON is now stringified on the Rust side.

Breaking change: responding with bare text is no longer allowed.

I have experimented with removing the thread spawn on blocking requests, which makes synchronous requests much faster.
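To see why a per-request thread spawn is costly, here is a rough, self-contained Python sketch (not Robyn's actual dispatch code) timing an inline call against a thread-per-call dispatch for a trivial handler:

```python
import threading
import time

def handler():
    # A trivial synchronous handler; real work would be request handling.
    return "Hello"

N = 1_000

# Inline dispatch: call the handler directly on the current thread.
start = time.perf_counter()
for _ in range(N):
    handler()
inline = time.perf_counter() - start

# Thread-per-call dispatch: spawn (and join) a fresh thread for every call.
start = time.perf_counter()
for _ in range(N):
    t = threading.Thread(target=handler)
    t.start()
    t.join()
spawned = time.perf_counter() - start

# For a handler that finishes in microseconds, thread creation dominates.
print(spawned > inline)
```

For cheap handlers, thread creation and joining dominates the total cost, which is consistent with the sync-route speedup reported below.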
Example of the speed we can get with sync text requests (PR results):
```
oha -n 100000 http://localhost:5000/sync

Summary:
  Success rate: 1.0000
  Total:        3.0909 secs
  Slowest:      0.0104 secs
  Fastest:      0.0001 secs
  Average:      0.0015 secs
  Requests/sec: 32352.5699

  Total data:   1.71 MiB
  Size/request: 17 B
  Size/sec:     565.19 KiB

Response time histogram:
  0.000 [3197]  |■■■
  0.001 [16553] |■■■■■■■■■■■■■■■■■■■
  0.001 [27004] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [22691] |■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [13940] |■■■■■■■■■■■■■■■■
  0.002 [7593]  |■■■■■■■■
  0.003 [3976]  |■■■■
  0.003 [2006]  |■■
  0.004 [1128]  |■
  0.004 [599]   |
  0.005 [1313]  |■

Latency distribution:
  10% in 0.0007 secs
  25% in 0.0010 secs
  50% in 0.0014 secs
  75% in 0.0019 secs
  90% in 0.0025 secs
  95% in 0.0030 secs
  99% in 0.0045 secs

Details (average, fastest, slowest):
  DNS+dialup: 0.0025 secs, 0.0009 secs, 0.0037 secs
  DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 100000 responses
```
Current master result:
```
oha -n 100000 http://localhost:5001/sync

Summary:
  Success rate: 1.0000
  Total:        3.6854 secs
  Slowest:      0.0123 secs
  Fastest:      0.0001 secs
  Average:      0.0018 secs
  Requests/sec: 27134.0017

  Total data:   1.81 MiB
  Size/request: 19 B
  Size/sec:     503.46 KiB

Response time histogram:
  0.000 [3007]  |■■■
  0.001 [13101] |■■■■■■■■■■■■■■■
  0.001 [26798] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [25279] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [15623] |■■■■■■■■■■■■■■■■■■
  0.003 [8099]  |■■■■■■■■■
  0.003 [4122]  |■■■■
  0.004 [2058]  |■■
  0.004 [919]   |■
  0.005 [446]   |
  0.005 [548]   |

Latency distribution:
  10% in 0.0009 secs
  25% in 0.0013 secs
  50% in 0.0017 secs
  75% in 0.0022 secs
  90% in 0.0029 secs
  95% in 0.0034 secs
  99% in 0.0045 secs

Details (average, fastest, slowest):
  DNS+dialup: 0.0017 secs, 0.0008 secs, 0.0023 secs
  DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0002 secs

Status code distribution:
  [200] 100000 responses
```
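As a sanity check on the two runs above: requests/sec is just 100,000 divided by the total time, and by that measure the PR branch comes out roughly 19% faster on this sync route.

```python
# Figures taken from the two oha runs quoted above.
pr_rps = 100_000 / 3.0909      # PR branch total time
master_rps = 100_000 / 3.6854  # master total time

# Both match the reported Requests/sec to within rounding.
assert abs(pr_rps - 32352.5699) < 10
assert abs(master_rps - 27134.0017) < 10

# Relative speedup of the PR branch over master on this route.
speedup = pr_rps / master_rps - 1
print(f"{speedup:.1%}")  # about 19%
```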
Hey @JackThomson2, thank you for your PR. I have limited access right now, so I will do a complete review on the 2nd.

Can you share results from the current main branch on your machine, so that it is easier to compare against this:

> Example speed we can get with sync text requests:
Also, I think if we use the CatchAll method as mentioned in the PyO3 docs, we can remove the need to wrap text responses in the text function.
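As a rough illustration of what a catch-all extraction buys at the Python-facing level: handlers could return either a bare string or a dict, and the framework decides how to serialize. The `coerce_response` name below is invented for this sketch; it is not Robyn's or PyO3's API:

```python
import json

def coerce_response(result):
    """Hypothetical sketch: accept either bare text or a dict body,
    mirroring what a catch-all extraction on the Rust side would allow."""
    if isinstance(result, str):
        return {"type": "text", "body": result}
    if isinstance(result, dict):
        return {"type": "json", "body": json.dumps(result)}
    raise TypeError(f"unsupported response type: {type(result).__name__}")

print(coerce_response("hello")["type"])   # text
print(coerce_response({"a": 1})["type"])  # json
```

With this kind of dispatch, users no longer need to wrap plain strings in an explicit text() helper.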
That is awesome, I completely missed that. I'll take a look at it later.
@sansyrox I updated the comment with results from master. Interestingly, this is even with the new one doing extra work, as it performs more checks in the flow.

I've been playing with a few unsafer options, which give crazy results, as below: pushing 40k requests per second.
```
oha -n 100000 http://localhost:5000/static

Summary:
  Success rate: 1.0000
  Total:        2.4989 secs
  Slowest:      0.0083 secs
  Fastest:      0.0001 secs
  Average:      0.0012 secs
  Requests/sec: 40017.9414

  Total data:   0 B
  Size/request: 0 B
  Size/sec:     0 B

Response time histogram:
  0.000 [3409]  |■■■■
  0.001 [13039] |■■■■■■■■■■■■■■■■
  0.001 [25930] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.001 [25727] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [16145] |■■■■■■■■■■■■■■■■■■■
  0.002 [8100]  |■■■■■■■■■
  0.002 [4040]  |■■■■
  0.003 [1848]  |■■
  0.003 [808]   |
  0.003 [391]   |
  0.004 [563]   |

Latency distribution:
  10% in 0.0006 secs
  25% in 0.0009 secs
  50% in 0.0012 secs
  75% in 0.0015 secs
  90% in 0.0019 secs
  95% in 0.0023 secs
  99% in 0.0030 secs

Details (average, fastest, slowest):
  DNS+dialup: 0.0023 secs, 0.0014 secs, 0.0032 secs
  DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0001 secs
```
> @sansyrox I updated the comment with results from master. Interestingly, this is even with the new one doing extra work, as it performs more checks in the flow.

Nicee 🔥

> I've been playing with a few unsafer options, which give crazy results, as below: pushing 40k requests per second.

I have usually stayed away from the unsafe Rust options, as I use Rust for its safety and performance. But how unsafe are we talking about?
So what I've been doing is using the assume_gil_acquired method, which faux-acquires the GIL. I've used this on objects that have been passed in as parameters, so to my understanding we shouldn't hit any threading issues, as the value won't be modified after it's sent. I still lock on the function call.
Oh, then it is not a big issue. I used to do it as well, but when I talked with the PyO3 folks, they suggested I use py.allow_threads, since I would have to acquire the GIL anyway (for the function call, in your case). So, as long as we can ensure the app won't crash, it is not an issue.
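Releasing the GIL matters because, while it is held, pure-Python CPU-bound threads cannot make progress in parallel. This is exactly the contention that releasing the GIL around Rust-side work avoids. A generic pure-Python illustration (unrelated to Robyn's code):

```python
import threading
import time

def busy(n: int) -> None:
    # Pure-Python CPU-bound loop; holds the GIL the whole time.
    while n:
        n -= 1

N = 2_000_000

# Run the workload twice, serially.
start = time.perf_counter()
busy(N)
busy(N)
serial = time.perf_counter() - start

# Run the same two workloads on two threads.
start = time.perf_counter()
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# On standard CPython, the threaded version gains almost nothing:
# both threads contend for the single GIL.
print(f"threaded/serial ratio: {threaded / serial:.2f}")
```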
Here's an example of what my experiment is now pushing:
```
oha -n 100000 http://localhost:5000/static

Summary:
  Success rate: 1.0000
  Total:        1.3311 secs
  Slowest:      0.0129 secs
  Fastest:      0.0000 secs
  Average:      0.0007 secs
  Requests/sec: 75123.6686

  Total data:   585.94 KiB
  Size/request: 6 B
  Size/sec:     440.18 KiB
```
We're now near native Rust performance, which is crazy: basically a 2x perf boost over the current numbers. Native Rust gave 90k req/sec.
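One way to sanity-check the "2x" claim against figures quoted earlier in this thread (the 40k/s experiment on the same route, and the rough 90k/s native Rust number):

```python
# Figures quoted earlier in the thread.
unsafe_rps = 75123.6686   # experimental unsafe branch (/static)
earlier_rps = 40017.9414  # earlier experiment in this thread (/static)
native_rust = 90_000      # rough native Rust figure mentioned above

# Roughly a 1.9x jump over the earlier 40k/s experiment...
assert 1.8 < unsafe_rps / earlier_rps < 2.0
# ...and within roughly 85% of the quoted native Rust throughput.
assert unsafe_rps / native_rust > 0.8
```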
@JackThomson2, this is very crazy indeed. What kind of performance are you getting from a native Actix server on your PC?
> Also, I think if we use the CatchAll method as mentioned in the PyO3 docs, we can remove the need to wrap text responses in the text function.

@JackThomson2, did you take a look at this?
Hi, yes, sorry, I've been a bit busy. But yes, I managed to get that to work, though it took a performance penalty. I was just trying to figure out a way to remove that overhead.
Hi @JackThomson2 👋 . Did you get a chance to look at this PR?
If we can close this merge request, please do.
Stale and implemented in v0.37.0