corvus
Why did TPS drop after installing corvus on top of a redis cluster?
I created 3 master nodes and 3 slave nodes, then configured corvus as follows:
bind 12345
node 127.0.0.1:6379,127.0.0.1:6380,127.0.0.1:6381
thread 4
After starting it I ran redis-benchmark and found that TPS dropped. Without corvus: GET: 33046.93 requests per second. With corvus: GET: 5262.88 requests per second.
I ran into the same problem: going through the corvus proxy is actually much worse than accessing the cluster directly. How did you end up solving it?
Not solved yet.
In a situation like this you have to investigate concretely. There are generally three main directions:
Latency (ping)
Since a middle layer has been added, you need to look at the actual path: it goes from client -> redis to client -> corvus -> redis. A simple test is to issue a fixed number of calls single-threaded (e.g. 10000 times) and compare the average latency of the two paths (see the example commands after this list).
If the per-request latency rises noticeably, solve that first; by itself it does not change the concurrency capacity much.
Bandwidth
Same approach as for latency.
Multiple instances
corvus is designed to be stateless, so you can run multiple instances for load balancing, for example three instances sharing the load, and benchmark again in that setup (a sketch of the invocations is below).
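A rough sketch of the latency comparison: run a single connection for a fixed number of requests against both paths and compare the reported averages (the ports here follow the test later in this thread, adjust to your own setup):
redis-benchmark -h 127.0.0.1 -p 6380 -c 1 -n 10000 -t get   # client -> redis
redis-benchmark -h 127.0.0.1 -p 6666 -c 1 -n 10000 -t get   # client -> corvus -> redis
With a single client the requests-per-second figure is roughly the inverse of the average round-trip time, so a big gap here points at the extra hop rather than at corvus's concurrency.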
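And a sketch of the multi-instance setup (bind ports and node list are placeholders; the flags mirror the corvus invocation shown later in this thread):
corvus -b 6666 -c default -t 4 -n 127.0.0.1:6380,127.0.0.1:6381 corvus-1.conf
corvus -b 6667 -c default -t 4 -n 127.0.0.1:6380,127.0.0.1:6381 corvus-2.conf
corvus -b 6668 -c default -t 4 -n 127.0.0.1:6380,127.0.0.1:6381 corvus-3.conf
Then point the clients (or an LVS/haproxy VIP) at the three bind ports and benchmark against the aggregate.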
Maybe because of redis pipelining? https://github.com/eleme/corvus/blob/df06ef75644cd100f13ac6224f4cd77c806c3080/src/client.c#L198
For example, if corvus receives 3 commands in a pipeline, it waits for the responses to all three commands before sending anything back to the client.
We did some tests: if we do not use this command queue to wait for all the responses, performance improves a lot.
So could we change the code to remove the wait on the command queue and reply to the client immediately? Or how can we contribute? @jasonjoo2010
I think things are not so simple.
I just ran a local test with the 6 nodes the original author mentioned and only ONE corvus proxy (also local) with 4 threads, testing ONLY "get" and "set" (I will explain why later), and the results are acceptable:
4 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 4 -n 1000000 -t get,set
====== SET ======
1000000 requests completed in 21.22 seconds
4 parallel clients
3 bytes payload
keep alive: 1
47123.13 requests per second
====== GET ======
1000000 requests completed in 17.19 seconds
4 parallel clients
3 bytes payload
keep alive: 1
58180.12 requests per second
10 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 10 -n 1000000 -t get,set
====== SET ======
1000000 requests completed in 13.09 seconds
10 parallel clients
3 bytes payload
keep alive: 1
76382.52 requests per second
====== GET ======
1000000 requests completed in 11.92 seconds
10 parallel clients
3 bytes payload
keep alive: 1
83899.66 requests per second
20 clients benchmark: redis-benchmark -h 127.0.0.1 -p 6666 -c 20 -n 1000000 -t get,set
====== SET ======
1000000 requests completed in 11.78 seconds
20 parallel clients
3 bytes payload
keep alive: 1
84904.06 requests per second
====== GET ======
1000000 requests completed in 10.20 seconds
20 parallel clients
3 bytes payload
keep alive: 1
98048.83 requests per second
For comparison I also tested against the redis nodes directly and put the results into tables:
GET:
| clients | redis | proxy |
|---|---|---|
| 4 | 108448.11 | 58180.12 |
| 10 | 113019.90 | 83899.66 |
| 20 | 111969.55 | 98048.83 |
SET:
| clients | redis | proxy |
|---|---|---|
| 4 | 102997.23 | 47123.13 |
| 10 | 108389.34 | 76382.52 |
| 20 | 112019.72 | 84904.06 |
The throughput is suspiciously constant when connecting to redis directly.
Why?
Because redis-benchmark doesn't support cluster mode until 6.0, and when I tried 6.0-rc1 I got a core dump, so I didn't dig deeper; the results through the proxy make sense to me (though they are local only).
I think the person who submitted this issue may not really have been putting load onto redis (most replies were MOVED redirects, which are fast).
What really deserves attention is the key point I mentioned in my previous post: don't overlook the physical latency along the transfer path.
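If you have a working 6.0+ build, redis-benchmark can be run in cluster mode so that it follows the slot layout instead of getting MOVED replies from a single node; something like (node ports as in the commands below):
redis-benchmark --cluster -h 127.0.0.1 -p 6380 -c 20 -n 1000000 -t get,set
That would give a direct-to-cluster baseline that is actually comparable with the proxy numbers.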
The cluster bootstrap commands are attached for reference:
redis-server --port 6380 --cluster-enabled yes --cluster-config-file nodes-6380.conf
redis-server --port 6381 --cluster-enabled yes --cluster-config-file nodes-6381.conf
redis-server --port 6382 --cluster-enabled yes --cluster-config-file nodes-6382.conf
redis-server --port 6383 --cluster-enabled yes --cluster-config-file nodes-6383.conf
redis-server --port 6384 --cluster-enabled yes --cluster-config-file nodes-6384.conf
redis-server --port 6385 --cluster-enabled yes --cluster-config-file nodes-6385.conf
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6381
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6382
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6383
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6384
redis-cli -h 127.0.0.1 -p 6380 cluster meet 127.0.0.1 6385
redis-cli -h 127.0.0.1 -p 6380 cluster addslots $(seq 0 5000)
redis-cli -h 127.0.0.1 -p 6381 cluster addslots $(seq 5001 11000)
redis-cli -h 127.0.0.1 -p 6382 cluster addslots $(seq 11001 16383)
redis-cli -h 127.0.0.1 -p 6383 cluster replicate $(redis-cli -h 127.0.0.1 -p 6383 cluster nodes |grep '0-5000' |awk '{print $1}')
redis-cli -h 127.0.0.1 -p 6384 cluster replicate $(redis-cli -h 127.0.0.1 -p 6384 cluster nodes |grep '5001-11000' |awk '{print $1}')
redis-cli -h 127.0.0.1 -p 6385 cluster replicate $(redis-cli -h 127.0.0.1 -p 6385 cluster nodes |grep '11001-16383' |awk '{print $1}')
# because the config file cannot be omitted, I just made a simple config with "bind 6666" in it
corvus -b 6666 -c default -t 4 -n 127.0.0.1:6380,127.0.0.1:6381 a.conf
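For anyone reproducing this: instead of passing everything on the command line, the same settings could live in the config file itself (key names as in the corvus config quoted at the top of this issue), roughly:
bind 6666
node 127.0.0.1:6380,127.0.0.1:6381
thread 4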
And further, as you mentioned: in my opinion pipelining, like publish/subscribe, is not really a suitable scenario for corvus. We implemented a branch supporting pub/sub, but nowadays we do it on the client side directly (a cluster-aware Jedis client, for example). corvus is efficient for legacy projects, but if you can support redis cluster directly, just connect directly.
For HA purposes we have an automatic publishing agent that manages the backends in lvs/dpvs [1], and the smart clients in the applications always stay synchronized with the cluster.
If you are in any kind of cloud environment, you can build such an agent to operate the load balancer through the cloud API so the VIP stays up to date.
[1] http://github.com/iqiyi/dpvs
I mean that when you run redis-benchmark it sends messages very fast, which triggers the redis pipelining path: the benchmark sends several RESP messages at once, and corvus waits until all of the responses have arrived.
Nope.
-P <numreq>  Pipeline <numreq> requests. Default 1 (no pipeline).
By default redis-benchmark does not enable pipelining.
What I do agree with is that it's better to connect to the redis cluster directly when we want to use that feature.
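To actually exercise the pipelining path discussed here you have to opt in with -P, for example (proxy port as in the earlier test):
redis-benchmark -h 127.0.0.1 -p 6666 -c 10 -n 1000000 -P 16 -t get,set
Comparing that run through corvus against the same run on a plain redis node should show how much the wait-for-all-responses logic in client.c costs under real pipelining.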
I just want to mention that the code I pointed to holds several requests and enlarges the latency; corvus gets better performance once that logic is changed.
I am asking the maintainers whether they can change it or whether I can contribute. You can try the same change and test whether you get better performance.
Yeah, you're right here. It would be great if you could submit a patch for it.
Our change may introduce bugs, so maybe we should ask whether the maintainers could fix it.
Or maybe they have their reasons for doing it this way...
Btw, who is the active maintainer of this repo now?
As far as I know they have moved to a semi-client-side layer, like a sidecar in container environments, which is closer to connecting to the cluster directly. We can cc @tevino.
Yes, it's here, take a look.