
BIG-IP controller sends an AS3 Pool with no pool members when leader-elector is used

Open samgabriel opened this issue 1 year ago • 6 comments

Setup Details

CIS Version : 2.15.0 Build: f5networks/k8s-bigip-ctlr:latest
BIGIP Version: Big IP 16 AS3 Version: 3.26 Agent Mode: AS3
Orchestration: K8S Orchestration Version: 1.20 Pool Mode: Nodeport
Additional Setup details: <Platform/CNI Plugins/ cluster nodes/ etc> weavenetworks

Description

Steps To Reproduce

  1. Create a Pod and a Service that use leader-elector, with a Service definition such as:
apiVersion: v1
kind: Service
metadata:
  name: quartz-web
  labels:
    app: quartz-web
spec:
  type: NodePort
  selector:
    app: quartz-web
    leader: "yes"
  externalTrafficPolicy: Local
  ports:
  - port: 80
    name: web
    targetPort: 8080
    nodePort: 30065
  2. The leader-elector code updates the annotation on the pod every few seconds, adding a lease record to ensure this is the leader pod.
  3. This works fine in version 1.14. After trying to upgrade to version 2.0, the VS and pools are created on the BIG-IP side, but with no pool members. The configuration used is this:

apiVersion: "cis.f5.com/v1"
kind: TransportServer
metadata:
  name: quartz.http.vs
  labels:
    f5cr: "true"
spec:
  mode: standard
  virtualServerName: quartz-web-2
  virtualServerAddress: "_removed_"
  virtualServerPort: _removed_
  pool:
    service:  quartz-web
    servicePort: "web"
    monitors:
      - type: http
        send: /ui
        recv: ""
        interval: 30
        timeout: 120

Expected Result

Virtual Server is created with pool containing the nodes

Actual Result

Diagnostic Information

We are seeing this in the controller logs every few seconds


2023/12/16 00:01:52 [DEBUG] Enqueueing Endpoints: &Endpoints{ObjectMeta:{quartz-web  default  4b9ef866-8c8e-491f-97eb-d0a5aca71534 402877503 0 2022-07-18 15:58:26 +0000 UTC <nil> <nil> map[app:quartz-web] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"quartz-1","leaseDurationSeconds":10,"acquireTime":"2023-12-06T19:00:42Z","renewTime":"2023-12-16T00:01:52Z","leaderTransitions":0} endpoints.kubernetes.io/last-change-trigger-time:2023-12-15T18:39:37Z] [] []  [{kube-controller-manager Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:endpoints.kubernetes.io/last-change-trigger-time":{}},"f:labels":{".":{},"f:app":{}}}}} {leader-elector Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:control-plane.alpha.kubernetes.io/leader":{}}},"f:subsets":{}}}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.34.0.0,TargetRef:&ObjectReference{Kind:Pod,Namespace:default,Name:quartz-1,UID:4728a8f8-9849-48fd-99c7-ce767b88d3f7,APIVersion:,ResourceVersion:402824919,FieldPath:,},Hostname:,NodeName:nil,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:web,Port:8080,Protocol:TCP,AppProtocol:nil,},},},},} 
2023/12/16 00:01:52 [DEBUG] Processing Key: &{default Endpoints quartz-web 0xc000817e00 Update  false}
2023/12/16 00:01:54 [DEBUG] Enqueueing Endpoints: &Endpoints{ObjectMeta:{quartz-web  default  4b9ef866-8c8e-491f-97eb-d0a5aca71534 402877513 0 2022-07-18 15:58:26 +0000 UTC <nil> <nil> map[app:quartz-web] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"quartz-1","leaseDurationSeconds":10,"acquireTime":"2023-12-06T19:00:42Z","renewTime":"2023-12-16T00:01:54Z","leaderTransitions":0} endpoints.kubernetes.io/last-change-trigger-time:2023-12-15T18:39:37Z] [] []  [{kube-controller-manager Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:endpoints.kubernetes.io/last-change-trigger-time":{}},"f:labels":{".":{},"f:app":{}}}}} {leader-elector Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:control-plane.alpha.kubernetes.io/leader":{}}},"f:subsets":{}}}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.34.0.0,TargetRef:&ObjectReference{Kind:Pod,Namespace:default,Name:quartz-1,UID:4728a8f8-9849-48fd-99c7-ce767b88d3f7,APIVersion:,ResourceVersion:402824919,FieldPath:,},Hostname:,NodeName:nil,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:web,Port:8080,Protocol:TCP,AppProtocol:nil,},},},},} 
2023/12/16 00:01:54 [DEBUG] Processing Key: &{default Endpoints quartz-web 0xc0000048c0 Update  false}
2023/12/16 00:01:57 [DEBUG] Enqueueing Endpoints: &Endpoints{ObjectMeta:{quartz-web  default  4b9ef866-8c8e-491f-97eb-d0a5aca71534 402877519 0 2022-07-18 15:58:26 +0000 UTC <nil> <nil> map[app:quartz-web] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"quartz-1","leaseDurationSeconds":10,"acquireTime":"2023-12-06T19:00:42Z","renewTime":"2023-12-16T00:01:57Z","leaderTransitions":0} endpoints.kubernetes.io/last-change-trigger-time:2023-12-15T18:39:37Z] [] []  [{kube-controller-manager Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:endpoints.kubernetes.io/last-change-trigger-time":{}},"f:labels":{".":{},"f:app":{}}}}} {leader-elector Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:control-plane.alpha.kubernetes.io/leader":{}}},"f:subsets":{}}}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.34.0.0,TargetRef:&ObjectReference{Kind:Pod,Namespace:default,Name:quartz-1,UID:4728a8f8-9849-48fd-99c7-ce767b88d3f7,APIVersion:,ResourceVersion:402824919,FieldPath:,},Hostname:,NodeName:nil,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:web,Port:8080,Protocol:TCP,AppProtocol:nil,},},},},} 
2023/12/16 00:01:57 [DEBUG] Processing Key: &{default Endpoints quartz-web 0xc000004dc0 Update  false}
2023/12/16 00:01:58 [DEBUG] [2023-12-16 00:01:58,899 __main__ DEBUG] config handler woken for reset
2023/12/16 00:01:58 [DEBUG] [2023-12-16 00:01:58,899 __main__ DEBUG] loaded configuration file successfully
2023/12/16 00:01:58 [DEBUG] [2023-12-16 00:01:58,900 __main__ DEBUG] NET Config: {}
2023/12/16 00:01:58 [DEBUG] [2023-12-16 00:01:58,900 __main__ DEBUG] loaded configuration file successfully
2023/12/16 00:01:58 [DEBUG] [2023-12-16 00:01:58,900 __main__ DEBUG] updating tasks finished, took 0.0008361339569091797 seconds
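
The repeated "Enqueueing Endpoints" entries above differ in the resource version and in the renewTime of the leader annotation. The following is a minimal sketch, assuming the controller log has been saved as plain text, for extracting those renewals and measuring the gap between them. The function names are illustrative and are not part of CIS.

```python
import re
from datetime import datetime

# Pulls holderIdentity and renewTime out of the leader annotation
# (control-plane.alpha.kubernetes.io/leader) embedded in a log line.
LEADER_RE = re.compile(
    r'"holderIdentity":"(?P<holder>[^"]+)".*?"renewTime":"(?P<renew>[^"]+)"'
)


def leader_renewals(log_lines):
    """Return (holder, renewTime) for every 'Enqueueing Endpoints' line."""
    renewals = []
    for line in log_lines:
        if "Enqueueing Endpoints" not in line:
            continue  # skip 'Processing Key' and other entries
        match = LEADER_RE.search(line)
        if match:
            renewals.append((match.group("holder"), match.group("renew")))
    return renewals


def renewal_gaps_seconds(renewals):
    """Return the seconds between consecutive renewTime values."""
    times = [
        datetime.strptime(renew, "%Y-%m-%dT%H:%M:%SZ") for _, renew in renewals
    ]
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])
    ]
```

Passing the lines of the saved log to leader_renewals and the result to renewal_gaps_seconds shows how often the Endpoints object is being re-enqueued because of lease renewals alone.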

Observations (if any)

Checking the AS3 sent to the BIG-IP, we see this in the pool definition for this server: "quartz_web_web_default": {"class":"Pool","minimumMonitors":1,"monitors":[{"use":"/kubernetes-as3/Shared/quartz_web_default_http_web"}]}

A few other servers are running correctly as expected; only this one has an issue. We also tried specifying the port and target port, but that did not work.
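
To spot this condition in a larger declaration, a minimal sketch like the following could scan the logged AS3 JSON for Pool objects that have no members. It assumes the declaration has been copied out of the CIS log into a Python dict. The tenant and application names are taken from the monitor path above, and example_pool with its member is hypothetical, included only for contrast.

```python
def pools_without_members(node, path=""):
    """Return the paths of all AS3 objects with class 'Pool' and no members."""
    found = []
    if isinstance(node, dict):
        # A missing 'members' key and an empty list are treated the same.
        if node.get("class") == "Pool" and not node.get("members"):
            found.append(path or "/")
        for key, value in node.items():
            found.extend(pools_without_members(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(pools_without_members(value, f"{path}/{index}"))
    return found


# Fragment shaped like the snippet in this report; example_pool is hypothetical.
declaration_fragment = {
    "kubernetes-as3": {
        "Shared": {
            "quartz_web_web_default": {
                "class": "Pool",
                "minimumMonitors": 1,
                "monitors": [
                    {"use": "/kubernetes-as3/Shared/quartz_web_default_http_web"}
                ],
            },
            "example_pool": {
                "class": "Pool",
                "members": [
                    {"servicePort": 30065, "serverAddresses": ["192.0.2.10"]}
                ],
            },
        }
    }
}

print(pools_without_members(declaration_fragment))
```

For the fragment above, only the quartz_web_web_default path is reported.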

samgabriel • Dec 16 '23 00:12

@samgabriel can you confirm whether the endpoints are created properly for the quartz-web service?

kubectl get ep

lavanya-f5 • Dec 20 '23 11:12

@lavanya-f5 Yes, it is working correctly with BIG-IP controller version 1.8.1. Among other endpoints, kubectl get ep returns: quartz-web 10.34.0.0:8080 528d

samgabriel • Dec 28 '23 22:12

@samgabriel Thanks for the confirmation. We are unable to reproduce this locally with the 2.15 CIS build. Could you please share the information below so we can debug the issue further:

  1. Please enable --log-as3-response, set --log-level to DEBUG, and share the complete CIS log file: kubectl logs deploy/ -n kube-system > logs_cis

  2. kubectl get nodes

  3. kubectl edit deploy/ -n kube-system -o yaml

lavanya-f5 • Dec 29 '23 05:12

@lavanya-f5 I had log-as3-response enabled and log-level set to debug; that is how I got this snippet from the AS3 post to the BIG-IP: "quartz_web_web_default": {"class":"Pool","minimumMonitors":1,"monitors":[{"use":"/kubernetes-as3/Shared/quartz_web_default_http_web"}]}

This is only part of the full request, which included 5 other services; I can't disclose the full log. The one difference with this VS is that its pool had no members item. All the other ones had nodes. In terms of configuration, the service has externalTrafficPolicy: Local, which means it is only served from the node where the pod is running, and it also uses the leader elector.

Also in the logs you can see 2023/12/16 00:01:52 [DEBUG] Enqueueing Endpoints: &Endpoints{ObjectMeta:{quartz-web default 4b9ef866-8c8e-491f-97eb-d0a5aca71534 402877503 0 2022-07-18 15:58:26 +0000 UTC map[app:quartz-web] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"quartz-1","leaseDurationSeconds":10,"acquireTime":"2023-12-06T19:00:42Z","renewTime":"2023-12-16T00:01:52Z","leaderTransitions":0} endpoints.kubernetes.io/last-change-trigger-time:2023-12-15T18:39:37Z] [] [] [{kube-controller-manager Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:endpoints.kubernetes.io/last-change-trigger-time":{}},"f:labels":{".":{},"f:app":{}}}}} {leader-elector Update v1 2023-12-15 18:39:37 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{"f:control-plane.alpha.kubernetes.io/leader":{}}},"f:subsets":{}}}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.34.0.0,TargetRef:&ObjectReference{Kind:Pod,Namespace:default,Name:quartz-1,UID:4728a8f8-9849-48fd-99c7-ce767b88d3f7,APIVersion:,ResourceVersion:402824919,FieldPath:,},Hostname:,NodeName:nil,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:web,Port:8080,Protocol:TCP,AppProtocol:nil,},},},},}

Which had Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]

In terms of nodes there are 4 worker nodes and all other services and VSes are working fine.
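
Since the service uses externalTrafficPolicy: Local, it may help to see which fields each endpoint address carries. Below is a minimal sketch, assuming the Endpoints object has been exported as JSON (for example with kubectl get ep quartz-web -o json) and loaded into a dict. The helper name is illustrative, the sample mirrors the log entry above, and the sketch does not establish why the pool was sent without members.

```python
def endpoint_addresses(endpoints):
    """Return (ip, nodeName, ready) for every address in an Endpoints object."""
    rows = []
    for subset in endpoints.get("subsets") or []:
        for address in subset.get("addresses") or []:
            rows.append((address.get("ip"), address.get("nodeName"), True))
        for address in subset.get("notReadyAddresses") or []:
            rows.append((address.get("ip"), address.get("nodeName"), False))
    return rows


# Sample shaped like the logged object: one ready address, no nodeName set.
sample_endpoints = {
    "kind": "Endpoints",
    "metadata": {"name": "quartz-web", "namespace": "default"},
    "subsets": [
        {
            "addresses": [
                {
                    "ip": "10.34.0.0",
                    "targetRef": {
                        "kind": "Pod",
                        "namespace": "default",
                        "name": "quartz-1",
                    },
                }
            ],
            "ports": [{"name": "web", "port": 8080, "protocol": "TCP"}],
        }
    ],
}

for ip, node_name, ready in endpoint_addresses(sample_endpoints):
    print(ip, node_name, "ready" if ready else "not ready")
```

In the logged object the single address is ready and its NodeName is nil, which this helper reports as None.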

samgabriel • Dec 29 '23 05:12

@samgabriel Please share Complete CIS logs and CIS config to automation_toolchain_pm [email protected].

trinaths • Dec 29 '23 05:12

@samgabriel please share the requested info.

trinaths • Feb 13 '24 16:02

No response from issue author. Closing this issue.

trinaths • May 23 '24 17:05