apisix
apisix copied to clipboard
apisix delay
Issue description
hello,
Gateway delay p99 is much higher than other indicators
This is the result of my pressure test using wrk
My routing configuration is

Environment
apisix vesion:apisix-2.6
can you paste the route configuration of zyml-getuserinfo-api as text?
{"id":"373703982817739720","create_time":1632274531,"update_time":1632274531,"uris":["/zymk-getuserinfo-api/"],"name":"zymk-getuserinfo-api","methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"],"plugins":{"prometheus":{"disable":false,"prefer_name":true},"proxy-rewrite":{"headers":{"X-Api-Engine":"apisix","X-Api-Version":"v1"},"regex_uri":["^/zymk-getuserinfo-api/(.)","/$1$2$3"]}},"upstream":{"nodes":[{"host":"172.21.24.51","port":7202,"weight":1}],"timeout":{"connect":6,"read":6,"send":6},"type":"roundrobin","scheme":"http","pass_host":"pass"},"status":1}
@zhixiang0608 Have you ever tried to benchmark the upstream directly? It's unknown whether the long tail requests are due to upstream or APISIX itself.
My reproduction steps are as follows
-
branch: release/2.6
case 1.1: route configuration:
{ "uris": [ "/zymk-getuserinfo-api/" ], "plugins": { "proxy-rewrite": { "headers": { "X-Api-Engine": "apisix", "X-Api-Version": "v1" }, "regex_uri": [ "^/zymk-getuserinfo-api/(.)", "/$1$2$3" ] } }, "upstream": { "nodes": { "127.0.0.1:1980": 1 }, "type": "roundrobin" } }benchmark:
wrk2 -t4 -c60 -d10s -R4000 --latency http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ Running 10s test @ http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ 4 threads and 60 connections Thread Stats Avg Stdev Max +/- Stdev Latency 1.20ms 755.26us 13.91ms 80.72% Req/Sec nan nan 0.00 0.00% Latency Distribution (HdrHistogram - Recorded Latency) 50.000% 1.14ms 75.000% 1.56ms 90.000% 1.93ms 99.000% 2.75ms 99.900% 10.99ms 99.990% 13.77ms 99.999% 13.92ms 100.000% 13.92mscase 1.2: route configuration:
{ "uris": [ "/zymk-getuserinfo-api/" ], "upstream": { "nodes": { "127.0.0.1:1980": 1 }, "type": "roundrobin" } }benchmark:
wrk2 -t4 -c60 -d10s -R4000 --latency http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ Running 10s test @ http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ 4 threads and 60 connections Thread Stats Avg Stdev Max +/- Stdev Latency 1.22ms 563.10us 9.66ms 67.77% Req/Sec nan nan 0.00 0.00% Latency Distribution (HdrHistogram - Recorded Latency) 50.000% 1.19ms 75.000% 1.59ms 90.000% 1.93ms 99.000% 2.62ms 99.900% 3.32ms 99.990% 8.47ms 99.999% 9.67ms 100.000% 9.67ms -
branch: master case 2.1: route configuration:
{ "uris": [ "/zymk-getuserinfo-api/" ], "plugins": { "proxy-rewrite": { "headers": { "X-Api-Engine": "apisix", "X-Api-Version": "v1" }, "regex_uri": [ "^/zymk-getuserinfo-api/(.)", "/$1$2$3" ] } }, "upstream": { "nodes": { "127.0.0.1:1980": 1 }, "type": "roundrobin" } }benchmark:
wrk2 -t4 -c60 -d10s -R4000 --latency http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ Running 10s test @ http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ 4 threads and 60 connections Thread Stats Avg Stdev Max +/- Stdev Latency 1.21ms 663.60us 15.90ms 76.45% Req/Sec nan nan 0.00 0.00% Latency Distribution (HdrHistogram - Recorded Latency) 50.000% 1.17ms 75.000% 1.56ms 90.000% 1.92ms 99.000% 2.65ms 99.900% 7.88ms 99.990% 15.33ms 99.999% 15.90ms 100.000% 15.90mscase 2.2: route configuration:
{ "uris": [ "/zymk-getuserinfo-api/" ], "upstream": { "nodes": { "127.0.0.1:1980": 1 }, "type": "roundrobin" } }benchmark:
wrk2 -t4 -c60 -d10s -R4000 --latency http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ Running 10s test @ http://127.0.0.1:9080/index.html/zymk-getuserinfo-api/ 4 threads and 60 connections Thread Stats Avg Stdev Max +/- Stdev Latency 1.15ms 544.15us 3.64ms 66.16% Req/Sec nan nan 0.00 0.00% Latency Distribution (HdrHistogram - Recorded Latency) 50.000% 1.14ms 75.000% 1.52ms 90.000% 1.86ms 99.000% 2.57ms 99.900% 3.08ms 99.990% 3.44ms 99.999% 3.64ms 100.000% 3.64ms
from case 1.1 and case 1.2, it is related to the regular configuration of the proxy-rewrite plugin from case 1.2 and case 2.2, it is related to apisix version
but this does not explain the 59 ms delay.
Things you need to be sure of.
-
whether
/zymk-getuserinfo-api/is the only route that has this problem, and if so, then it may be related to the upstream of this route. -
remove the prometheus and proxy-rewrite plugins from the
/zymk-getuserinfo-api/route separately to determine if it is an APISIX 2.6 issue or a plugin issue.
possibly relate PR: https://github.com/apache/apisix/pull/4405/files
BTW, do not use wrk to do benchmark, see why:https://www.apiseven.com/zh/blog/analysis-of-inaccurate-distribution-of-wrk-latency, and disable the prometheus plugin, just use wrk2 to do:https://github.com/giltene/wrk2
use https://github.com/tsenart/vegeta
@zhixiang0608 Have you ever tried to benchmark the upstream directly? It's unknown whether the long tail requests are due to upstream or APISIX itself. ther routing delays are normal, only this one,using this formula, I think it has nothing to do with service apisix latency = latency - upstream_reponse_time
can you use openresty-xray to test this route? It's really weird, and it's really weird to remove prometheus and use wrk2.
Receive response body delay from upstream 😂
Receive response body delay from upstream 😂
According to this formula, the time consumption of upstream responders shall be removed

You are saying that some requests are skipped apisix_latency = apisix_latency - upstream_latency? Or maybe some requests don't have upstream_response_time?
Let me think, what is the timeout set upstream?
I'm wondering if it's possible that the upstream response timeout is causing these requests to close.
This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.