codis-operator

Specifying loadbalancer for proxy service

Open oruchreis opened this issue 6 years ago • 17 comments

Hi, I'm running the default Kubernetes deployment scripts from the codis repo, with small changes, in production on Azure AKS. However, I have run into situations where the proxy or the server went down, so I started looking for another way to run codis on k8s. I've tried your project, but as far as I can see it installs the services as cluster-internal. I need the LoadBalancer type for the proxy and fe services, and I also need to add annotations that mark the load balancer for internal network use, so that Azure AKS assigns the service a static internal IP address. How can I specify the LoadBalancer type for the services and add annotations to them? Also, are there any issues that would prevent us from using this in production environments?
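
For context, on a plain Kubernetes Service both requests map to standard fields: `spec.type: LoadBalancer` and `metadata.annotations`. The sketch below builds such a manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts. The annotation key is the one Azure documents for internal load balancers; the service name, selector labels, and port are placeholders chosen for illustration, and nothing here reflects how codis-operator itself exposes these settings.

```python
import json


def loadbalancer_service(name, port, selector, annotations=None):
    """Build a Kubernetes Service manifest of type LoadBalancer as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": dict(annotations or {})},
        "spec": {
            "type": "LoadBalancer",
            "selector": dict(selector),
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


# Hypothetical values: the name, selector and port are placeholders.
svc = loadbalancer_service(
    name="codis-proxy",
    port=19000,
    selector={"app": "codis-proxy"},
    annotations={"service.beta.kubernetes.io/azure-load-balancer-internal": "true"},
)
print(json.dumps(svc, indent=2))
```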

oruchreis, Dec 29 '18 18:12

Thanks for your attention and good advice. I have opened an issue (#19) and will complete it by tomorrow at the latest. Warning: codis-operator is currently a work in progress [WIP] and is NOT ready for production. Use it at your own risk. You can try it in your test environment.

tangcong, Dec 30 '18 05:12

Best practices:

  • Use a PV to store Redis data (SSD disks are better).
  • Use dedicated nodes to run codis-server (Redis).
  • Set a max memory limit (node memory) for codis-server and assign it enough memory.
  • Make sure resource requests and limits are equal, so that the k8s pod QoS class is Guaranteed and eviction/OOM kills seldom happen.
  • It is better if your pod IPs are sticky.
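
The point about requests and limits follows the Kubernetes QoS rules: a pod is Guaranteed only when every container has CPU and memory limits and its requests equal them. A minimal Python sketch of that classification, assuming each container is given as a dict with optional `requests` and `limits` maps (this input shape is an illustration, not an API of the operator):

```python
def qos_class(containers):
    """Classify a pod as Guaranteed, Burstable or BestEffort.

    Each container is a dict like {"requests": {...}, "limits": {...}}.
    Quantities are compared as literal strings, so "1" and "1000m" would
    count as different here; a real check would normalise them first.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests") or {}
        limits = container.get("limits") or {}
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit, so only a missing
            # limit or a differing request breaks the Guaranteed condition.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

For example, a codis-server container with `requests` and `limits` both set to `{"cpu": "2", "memory": "8Gi"}` is classified as Guaranteed, while lowering only the CPU request makes the pod Burstable.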

tangcong, Dec 30 '18 05:12

Some issues remain to be solved:

  • Monitoring (proxy/redis).
  • A dedicated scheduler for codis-server pods (k8s has no concept of a "codis group". One group may have 2-N replicas, and we want every codis-server pod in the same group to be scheduled onto a different node, so that when one node crashes or has an outage we can promote another slave to master).
  • Making sure that nodes can be drained safely and automatically.
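
The scheduling constraint in the second item can be stated as a simple check: no node should host more than one member of the same codis group. The Python sketch below only verifies an existing placement; it is not a scheduler, and the `(pod, group, node)` tuples are an assumed input format used for illustration.

```python
from collections import Counter, defaultdict


def colocated_groups(pods):
    """Return {group: [nodes]} for nodes hosting more than one pod of a group.

    `pods` is an iterable of (pod_name, group_id, node_name) tuples.
    An empty result means every group is spread across distinct nodes.
    """
    per_group = defaultdict(Counter)
    for _pod, group, node in pods:
        per_group[group][node] += 1
    violations = {}
    for group, nodes in per_group.items():
        crowded = sorted(node for node, count in nodes.items() if count > 1)
        if crowded:
            violations[group] = crowded
    return violations


# Hypothetical placement: group 2 has both replicas on node-a.
placement = [
    ("redis-0", 1, "node-a"),
    ("redis-1", 1, "node-b"),
    ("redis-2", 2, "node-a"),
    ("redis-3", 2, "node-a"),
]
print(colocated_groups(placement))
```

In this placement, losing node-a would take down both members of group 2 at once, which is exactly the situation the item above wants to rule out.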

@oruchreis

tangcong, Dec 30 '18 06:12

Done in ea5958402ab29e640c95ba924fa8544c96610358, you can give it a try. @oruchreis Example: https://github.com/tangcong/codis-operator/blob/master/examples/sample-3.yml

tangcong, Dec 30 '18 16:12

Thanks a lot, I'll try it and report the result here before long.

oruchreis, Dec 31 '18 13:12

Hi, I tried this on a fresh Kubernetes install. First I tried without RBAC, but codis-fe displayed the proxies in a timeout state, and there were no servers or sentinels. Then I tried with RBAC, but it failed again. Meanwhile, the Kubernetes dashboard displays the pods as healthy. By the way, serviceAnnotations worked as expected: I could set public and internal IPs on the load balancer flawlessly. I've attached the codis-operator logs. logs-from-codis-operator-in-codis-operator-0.txt

oruchreis, Feb 12 '19 09:02

How can I reproduce it (as minimally and precisely as possible)? What is your Kubernetes version? Can you provide the codis-proxy logs and a codis-fe snapshot?

tangcong, Feb 12 '19 14:02

Kubernetes version is 1.12.4. Here is the YAML that I used, which is cloned from sample3: codis-operator.txt Here are the logs from the proxy and the dashboard: logs.zip I don't know how to get a snapshot of codis-fe. Codis-fe has no logs other than these:

    2019/02/12 08:05:09 main.go:101: [WARN] set ncpu = 2
    2019/02/12 08:05:09 main.go:104: [WARN] set listen = 10.90.44.166:9090
    2019/02/12 08:05:09 main.go:120: [WARN] set assets = /gopath/src/github.com/CodisLabs/codis/bin/assets
    2019/02/12 08:05:09 main.go:162: [WARN] set --etcd = etcd-client:2379

oruchreis, Feb 12 '19 19:02

[error]: Get http://web-codis-dashboard.default.svc.cluster.local:18080/api/topom/model: dial tcp: lookup web-codis-dashboard.default.svc.cluster.local: no such host
    4   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:134
            github.com/CodisLabs/codis/pkg/utils/rpc.apiRequestJson
    3   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:169
            github.com/CodisLabs/codis/pkg/utils/rpc.ApiGetJson
    2   /gopath/src/github.com/CodisLabs/codis/pkg/topom/topom_api.go:787
            github.com/CodisLabs/codis/pkg/topom.(*ApiClient).Model
    1   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:329
            main.OnlineProxy
    0   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:289
            main.AutoOnlineWithDashboard

It seems that codis-proxy cannot connect to the dashboard because it failed to resolve the dashboard's DNS name (web-codis-dashboard.default.svc.cluster.local). Is the k8s DNS service working properly? I have only tested it on k8s 1.10.
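
One quick way to test the DNS hypothesis from inside a pod is to attempt the lookup directly. A minimal Python sketch using only the standard library; the helper name `can_resolve` is made up for this example:

```python
import socket


def can_resolve(hostname):
    """Return True if the resolver can turn `hostname` into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised for "no such host" and other resolver failures.
        return False
```

Calling `can_resolve("web-codis-dashboard.default.svc.cluster.local")` from the proxy pod would show whether the "no such host" error in the trace above is reproducible at that moment or was transient during startup.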

tangcong, Feb 13 '19 02:02

I can connect to this URL with curl from inside the proxy. I've removed the problematic proxies, and now only one proxy seems to be connected to the dashboard. But no server or sentinel is displayed in codis-fe. I also recreated the server and sentinel pods, but that didn't work either. I've also checked etcd with etcd-browser: it shows only one proxy, and no group, server, or sentinel. By the way, I can create groups and add the IP addresses of the servers manually.

oruchreis, Feb 13 '19 08:02

ERROR: logging before flag.Parse: I0212 06:58:46.134697       1 dashboard.go:59] Successful Create,create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.134740       1 dashboard.go:190] deploy codis dashboard image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.135471       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216557       1 dashboard.go:77] Successful Create,create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216627       1 codiscluster_controller.go:167] reconcile dashboard succ
ERROR: logging before flag.Parse: I0212 06:58:46.217034       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300882       1 proxy.go:72] Successful Create,create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300924       1 proxy.go:284] deploy codis proxy image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.302047       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330207       1 proxy.go:90] Successful Create,create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330360       1 proxy.go:422] codis proxy hpa:{1 3 10}
ERROR: logging before flag.Parse: I0212 06:58:46.331222       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538063       1 proxy.go:108] Successful Create,create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538083       1 codiscluster_controller.go:173] reconcile proxy succ
ERROR: logging before flag.Parse: I0212 06:58:46.538289       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628963       1 fe.go:58] Successful Create,create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628992       1 fe.go:212] deploy codis fe image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.629463       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659217       1 fe.go:76] Successful Create,create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659236       1 codiscluster_controller.go:179] reconcile fe succ
ERROR: logging before flag.Parse: I0212 06:58:46.659890       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693087       1 redis.go:62] Successful Create,create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693128       1 redis.go:224] deploy codis-server image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.693440       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729154       1 redis.go:80] Successful Create,create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729206       1 codiscluster_controller.go:185] reconcile redis succ
ERROR: logging before flag.Parse: I0212 06:58:46.729652       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752893       1 sentinel.go:62] Successful Create,create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752932       1 sentinel.go:220] deploy redis-sentinel image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.753208       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.776349       1 sentinel.go:80] Successful Create,create StatefulSet web-codis-redis-sentinel in CodisCluster web-codis successful

The codis-operator log shows that every component was created successfully at that time. It is strange that I cannot find any error message. Are all the component pods running? Maybe I've got it: for now, you have to create the groups and add the redis/sentinel instances to your cluster manually. Later, I will make the operator add the components to codis-fe automatically.
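
A note on reading the operator log above: the character right after the `ERROR: logging before flag.Parse:` prefix is the glog severity (`I`, `W`, `E` or `F`), so every line shown is info-level despite the prefix. That prefix is what glog is known to print when something logs before `flag.Parse()` has been called, not a sign of a failed step. A small Python sketch, assuming the standard glog header layout, that strips the prefix and extracts the severity:

```python
import re

# glog header: Lmmdd hh:mm:ss.uuuuuu threadid file:line] message
GLOG_LINE = re.compile(
    r"(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^\s\]]+:\d+)\] (?P<message>.*)"
)
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}
PREFIX = "ERROR: logging before flag.Parse: "


def parse_glog(line):
    """Return (severity, source, message) for a glog line, or None if it does not match."""
    if line.startswith(PREFIX):
        line = line[len(PREFIX):]
    match = GLOG_LINE.match(line)
    if match is None:
        return None
    return SEVERITY[match["severity"]], match["source"], match["message"]


sample = (
    "ERROR: logging before flag.Parse: I0212 06:58:46.216627       1 "
    "codiscluster_controller.go:167] reconcile dashboard succ"
)
print(parse_glog(sample))
```

Filtering the attached log for anything other than `INFO` with such a parser is a quick way to confirm that the operator reported no warnings or errors.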

tangcong, Feb 13 '19 12:02

Hmm, then it is my bad. I expected to see every component in codis-fe automatically, as happens with the k8s YAML scripts in the codis repo. If I add every component and create the groups manually, then when I scale up or down, will I also have to add the new servers or create new groups myself?

oruchreis, Feb 17 '19 17:02

Yes. I will make it add the components to codis-fe automatically as soon as possible. (I have been busy recently.)

tangcong, Feb 18 '19 01:02

In my case, the codis-operator error is "codis-dashboard.codis-operator.svc.cluster.local: no such host". The reason is that my cluster cannot resolve xxx.xxx.svc.cluster.local; it resolves names like "xxx.xxx.svc.[mycluster].local" instead. Is there a configuration option to change it?
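
The name in that error follows the usual Kubernetes pattern `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local` but can be changed per cluster. A Python sketch of building the name with the domain as a parameter instead of a constant; the function name and the `mycluster.local` value are illustrative assumptions, not code from the operator:

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name of a Service for a given cluster domain."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Default domain, as in the error message above.
print(service_fqdn("codis-dashboard", "codis-operator"))
# A cluster configured with a different domain (hypothetical value).
print(service_fqdn("codis-dashboard", "codis-operator", "mycluster.local"))
```

Another common option is to use the shorter `<service>.<namespace>` form and rely on the search domains in the pod's resolver configuration, which avoids naming the cluster domain at all.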

ZhangSIming-blyq, Dec 01 '21 06:12

[image] Hard-coded

ZhangSIming-blyq, Dec 01 '21 06:12

It is a demo and not ready for production (this is explained in the README). My job has changed and I have no time to optimize it. @ZhangSIming-blyq

tangcong, Dec 01 '21 06:12

OK, thanks anyway.

ZhangSIming-blyq, Dec 01 '21 06:12