Cannot log in when running multiple portal pod replicas in a k8s cluster
Version information:
- k8s version: 1.14.1
- apollo version: 1.7.1
- traefik version: 1.7
- helm chart version: 0.1.1
Problem description
Following this guide: https://github.com/ctripcorp/apollo/wiki/%E5%88%86%E5%B8%83%E5%BC%8F%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97#241-%E5%9F%BA%E4%BA%8Ekubernetes%E5%8E%9F%E7%94%9F%E6%9C%8D%E5%8A%A1%E5%8F%91%E7%8E%B0 I deployed Apollo and integrated it with LDAP (Windows AD). When apollo-portal is configured with multiple pod replicas, login fails; after switching to a single replica, login works normally.
The official docs only provide an example for the nginx ingress controller, while our production environment uses the traefik ingress controller. I adapted the official example as shown below, but login still fails. Debugging with Chrome shows that the login request itself actually succeeds — the page just never redirects to the dashboard. I can't figure out where the problem is; any help would be appreciated.
The helm chart is from here: https://github.com/ctripcorp/apollo/tree/master/docs/charts
Ingress configuration:
```yaml
apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      field.cattle.io/ingressState: '{"YXBvbGxvLXBvcnRhbC9tYWxsL2FpYmVlLmNuLy8vODA3MA==":""}'
      traefik.ingress.kubernetes.io/affinity: "true"
      traefik.ingress.kubernetes.io/ingress.class: traefik
      traefik.ingress.kubernetes.io/load-balancer-method: drr
      traefik.ingress.kubernetes.io/max-conn-amount: "10"
      traefik.ingress.kubernetes.io/session-cookie-name: JSESSIONID
    creationTimestamp: "2020-12-22T07:02:24Z"
    generation: 3
    labels:
      app.kubernetes.io/version: 1.7.1
    name: apollo-portal
    namespace: mall
    resourceVersion: "336484675"
    selfLink: /apis/extensions/v1beta1/namespaces/mall/ingresses/apollo-portal
    uid: a48ff897-4423-11eb-b624-ac1f6b6ca72e
  spec:
    rules:
    - host: conf.test.cn
      http:
        paths:
        - backend:
            serviceName: apollo-portal
            servicePort: 8070
          path: /
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```
Service configuration:
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    field.cattle.io/ipAddresses: "null"
    field.cattle.io/targetDnsRecordIds: "null"
    field.cattle.io/targetWorkloadIds: "null"
  creationTimestamp: "2020-12-22T07:02:23Z"
  labels:
    app.kubernetes.io/version: 1.7.1
  name: apollo-portal
  namespace: mall
  resourceVersion: "336490784"
  selfLink: /api/v1/namespaces/mall/services/apollo-portal
  uid: a47e04ef-4423-11eb-a028-ac1f6b6cd636
spec:
  clusterIP: 10.68.65.74
  ports:
  - name: http
    port: 8070
    protocol: TCP
    targetPort: 8070
  selector:
    app: apollo-portal
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  type: ClusterIP
status:
  loadBalancer: {}
```
Pod information:
```
➜ kubectl get pod
NAME                                    READY   STATUS    RESTARTS   AGE
apollo-adminservice-86f68f989b-hpv4j    1/1     Running   0          3h54m
apollo-adminservice-86f68f989b-rv9bd    1/1     Running   0          3h54m
apollo-configservice-7467fb54f8-bs7dx   1/1     Running   0          3h54m
apollo-configservice-7467fb54f8-zbpxm   1/1     Running   0          3h54m
apollo-portal-57ddbf686f-p4vcm          1/1     Running   0          32m
apollo-portal-57ddbf686f-trb8z          1/1     Running   0          4m14s
➜ kubectl get ing
NAME            HOSTS          ADDRESS   PORTS   AGE
apollo-portal   conf.test.cn             80      178m
➜ kubectl get svc
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
apollo-adminservice    ClusterIP   10.68.6.254     <none>        8090/TCP   4h1m
apollo-configdb        ClusterIP   10.68.95.70     <none>        3306/TCP   4h1m
apollo-configservice   ClusterIP   10.68.53.26     <none>        8080/TCP   4h1m
apollo-portal          ClusterIP   10.68.65.74     <none>        8070/TCP   178m
apollo-portaldb        ClusterIP   10.68.108.159   <none>        3306/TCP   178m
```
@nobodyiam Have you run into this problem before?
@iwz2099 We only tested with the nginx controller before — could you try switching to nginx per the docs? The principle is the same either way: the ingress just needs to do session sticky when forwarding. I'm not familiar with traefik's configuration, though.
We hit this problem too. Our guess is that with multiple portal instances, sessions are not shared between the instances. We didn't set up session stickiness and just run a single instance — the portal's load isn't heavy, and even if it goes down, running applications are unaffected, so it's tolerable for us.
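As an alternative to stickiness, sessions could in principle be shared across portal instances via an external session store such as Spring Session backed by Redis. This is only a sketch under the assumption that the portal (a Spring Boot app) can pick up extra properties and has `spring-session-data-redis` on its classpath — Apollo does not ship this out of the box, and the Redis host below is a hypothetical placeholder:

```properties
# Hypothetical portal application.properties additions —
# requires spring-session-data-redis and a reachable Redis instance.
spring.session.store-type=redis
spring.redis.host=redis.mall.svc.cluster.local
spring.redis.port=6379
```

With a shared session store, any replica can serve any request, so no ingress-level affinity is needed.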
Session affinity on the ingress solves this. Specifically, add the following:

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"  # enables session affinity
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
```
Hitting the same problem here, +1...
@nobodyiam @JamalyYao
Everyone — this has been solved; I forgot to follow up. Per the official traefik docs https://doc.traefik.io/traefik/v1.7/configuration/backends/kubernetes/ the annotation
`traefik.ingress.kubernetes.io/affinity: "true"`
is actually somewhat misleading; switching back to the `sticky` annotation fixed it. My configuration is below.
Ingress configuration:
```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    traefik.backend.loadbalancer.sticky: "true"
    traefik.ingress.kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/load-balancer-method: drr
    traefik.ingress.kubernetes.io/max-conn-amount: "1000"
    traefik.ingress.kubernetes.io/session-cookie-name: JSESSIONID
```
Service configuration:
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    traefik.backend.loadbalancer.sticky: "true"
    traefik.ingress.kubernetes.io/load-balancer-method: drr
    traefik.ingress.kubernetes.io/session-cookie-name: JSESSIONID
```
Many similar issues have been raised before; generally, hashing requests to portal instances by source IP at the LB layer is enough to solve it.
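For reference, the source-IP hashing described above can be sketched on the nginx ingress controller with the `upstream-hash-by` annotation (an alternative to the cookie-based affinity shown earlier; annotation name per the ingress-nginx documentation, applied to the portal's Ingress):

```yaml
metadata:
  annotations:
    # Route each client IP consistently to the same portal pod,
    # so the session stays on one instance without cookies.
    nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"
```

Note that IP hashing can break down when clients sit behind a shared NAT or proxy (all traffic lands on one pod), in which case cookie-based affinity distributes load more evenly.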