
Backend Servers not created/updated since v1.7.11

[Open] dschuldt opened this issue 2 years ago • 20 comments

Hi,

since the 1.7.11 update (it happens with 1.8.0 as well) we have been experiencing the following issue (K8s version 1.22):

If an Ingress is configured with a pathType: Exact rule, the resulting backend is created without servers.

Example:

- path: /actuator/health/readiness
  pathType: Exact
  backend:
    service:
      name: api-gateway-master
      port:
        name: management

[Screenshot 2022-06-14 112240]

With version 1.7.10 (server slots scaled to "5"):

[Screenshot 2022-06-14 112343]

After some time or a few reloads of the services and the ingress controller, the servers are applied; I cannot tell which action actually helps. But if we then update the Ingress or Service resource, e.g. rename a port, the change is not propagated to the HAProxy config, although the log clearly says an update is necessary and performed.
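
For context, the controller fills a backend's server slots from the Service's Endpoints object, so one way to narrow this down is to check whether the Endpoints for the Service carry the named port. A rough sketch of what a healthy object looks like; the address and the container port number are illustrative assumptions, not values from our cluster:

apiVersion: v1
kind: Endpoints
metadata:
  name: api-gateway-master
subsets:
  - addresses:
      - ip: 10.42.0.15        # illustrative pod IP
    ports:
      - name: management      # matches the Service port name
        port: 9443            # assumed container port behind targetPort "management"
        protocol: TCP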

This is the spec of the corresponding service:

...
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  - name: management
    port: 9443
    protocol: TCP
    targetPort: management
  selector:
    app: api-gateway
    branch: master

If I can assist with this issue, please let me know.

Thanks a lot!

Best
Denis

dschuldt avatar Jun 14 '22 09:06 dschuldt

Hi, @dschuldt.

I could not reproduce the bug. I debugged the code involved in creating backends and could find no apparent flaws.

Can you post the values.yaml and the other annotations of the problematic Service?

fabianonunes avatar Jun 15 '22 14:06 fabianonunes

Hi @fabianonunes,

we do not deploy the controller with Helm, but with the plain manifests from the repo.

The controller runs with the following args:

args:
  - --configmap=haproxy-controller/haproxy-kubernetes-ingress
  - --default-backend-service=haproxy-controller/haproxy-kubernetes-ingress-default-backend
  - --configmap-tcp-services=haproxy-controller/tcpservices
  - --default-ssl-certificate=haproxy-controller/wildcard-cert
  - --publish-service=haproxy-controller/haproxy-kubernetes-ingress-lb
  - --disable-ipv6

The configmap spec:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-kubernetes-ingress
  namespace: haproxy-controller
data:
  syslog-server: "address:stdout, format: raw, facility:daemon"
  forwarded-for: "false"
  request-capture: |
    hdr(Host)
    hdr(X-Forwarded-For)
  timeout-client: 60s
  timeout-server: 60s
  frontend-config-snippet: |
    option forwardfor if-none
  stats-config-snippet: |
    http-request set-log-level silent
  scale-server-slots: "5"

The full Service spec (the manifest is templated with Ansible before deployment):

---
apiVersion: v1
kind: Service
metadata:
  name: {{ extra_vars_service_name }}
  namespace: {{ extra_vars_namespace }}
  labels:
    app: {{ extra_vars_app_name }}
    version: "{{ extra_vars_tag }}"
    springBootAdmin: enabled
spec:
  selector:
    app: {{ extra_vars_app_name }}
    branch: "{{ extra_vars_branch }}"
  ports:
    - name: https
      port: 443
      targetPort: https
    - name: management
      port: 9443
      targetPort: management

The full Ingress spec:

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ extra_vars_deployment_name }}
  namespace: {{ extra_vars_namespace }}
  labels:
    app_name: {{ extra_vars_app_name }}
    version: "{{ extra_vars_tag }}"
  annotations:
    haproxy.com/server-ssl: "true"
spec:
  tls:
  - hosts:
      - {{ extra_vars_host_url }}
    secretName: wildcard-cert
  rules:
  - host: {{ extra_vars_host_url }}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: {{ extra_vars_service_name }}
            port:
              name: https
      - path: /actuator/health/readiness
        pathType: Exact
        backend:
          service:
            name: {{ extra_vars_service_name }}
            port:
              name: management

Thanks for your help.

dschuldt avatar Jun 15 '22 20:06 dschuldt

Hi @dschuldt, it seems Kubernetes doesn't like the naming of the ports in your Service. Can you confirm that it works if you change the targetPort to point to a port value? On my side, I see no ports on the corresponding Endpoints when using the same naming as yours.
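
For illustration, a minimal sketch of the suggested change, with targetPort pointing at a numeric container port instead of a name (9443 is an assumed value here; substitute the actual container port):

spec:
  ports:
    - name: management
      port: 9443
      protocol: TCP
      targetPort: 9443   # numeric container port instead of the named port "management" (assumed value)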

ivanmatmati avatar Jun 16 '22 08:06 ivanmatmati

Hi @ivanmatmati,

it does not work with other port names, or with port numbers instead of names, either.

I also tried with pathType: Prefix for /actuator/health/readiness... it does not work either. So the title of my issue is a bit misleading :) (I will edit it after this comment).

This morning I performed this debugging session:

  1. Ran the controller with --log=trace.
  2. Renamed the port in the Service from "management" to "actuator".

Here are the logs for v1.7.10 (working) and v1.7.11 (not working). I stripped unrelated entries for better readability. Please ignore the changed api-gateway names, since this deployment comes from a different branch.

1.7.10
------

2022/06/17 07:00:12 TRACE   controller.go:144 HAProxy config sync started
2022/06/17 07:00:12 TRACE   ingress/ingress.go:137 Processing Ingress annotations in ConfigMap
2022/06/17 07:00:12 TRACE   ingress/ingress.go:196 Ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing secrets...
2022/06/17 07:00:12 TRACE   ingress/ingress.go:211 Ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing annotations...
2022/06/17 07:00:12 TRACE   ingress/ingress.go:221 ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing rules...
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_3'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_4'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_5'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_6'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_7'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_8'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_9'
2022/06/17 07:00:12 TRACE   service/endpoints.go:77 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_10'
2022/06/17 07:00:12 DEBUG   service/service.go:130 Service 'api-gateway/api-gateway-feature-linkerd-setup': new backend 'api-gateway-api-gateway-feature-linkerd-setup-actuator', reload required
Reload required
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_1'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_2'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_3'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_4'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_5'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_6'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_7'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_8'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_9'
2022/06/17 07:00:12 TRACE   service/endpoints.go:82 Creating server 'api-gateway-api-gateway-feature-linkerd-setup-actuator/SRV_10'
2022/06/17 07:00:12 DEBUG   maps/main.go:124 Map file 'path-exact' updated, reload required
2022/06/17 07:00:12 DEBUG   handler/refresh.go:57 Deleting backend 'api-gateway-api-gateway-feature-linkerd-setup-management'
2022/06/17 07:00:12 INFO    controller.go:213 HAProxy reloaded
2022/06/17 07:00:12 TRACE   controller.go:219 HAProxy config sync ended
[WARNING]  (257) : Reexecuting Master process
[WARNING]  (257) : config: Can't get version of the global server state file '/var/state/haproxy/global'.
Proxy healthz stopped (cumulated conns: FE: 2, BE: 0).
Proxy http stopped (cumulated conns: FE: 0, BE: 0).
Proxy https stopped (cumulated conns: FE: 1, BE: 0).
[WARNING]  (284) : Proxy healthz stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (284) : Proxy http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (284) : Proxy https stopped (cumulated conns: FE: 1, BE: 0).
[WARNING]  (284) : Proxy stats stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (284) : Stopping frontend GLOBAL in 0 ms.
[WARNING]  (284) : Stopping backend api-gateway-api-gateway-feature-linkerd-setup-http in 0 ms.
[WARNING]  (284) : Stopping backend api-gateway-api-gateway-feature-linkerd-setup-management in 0 ms.
Proxy stats stopped (cumulated conns: FE: 2, BE: 0).
Stopping backend api-gateway-api-gateway-feature-linkerd-setup-http in 0 ms.
Stopping backend api-gateway-api-gateway-feature-linkerd-setup-management in 0 ms.
[NOTICE]   (257) : New worker #1 (294) forked
[WARNING]  (257) : Former worker #1 (284) exited with code 0 (Exit)

1.7.11
------

2022/06/17 07:04:26 TRACE   controller.go:142 HAProxy config sync started
2022/06/17 07:04:26 TRACE   ingress/ingress.go:145 Processing Ingress annotations in ConfigMap
2022/06/17 07:04:26 TRACE   ingress/ingress.go:205 Ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing secrets...
2022/06/17 07:04:26 TRACE   ingress/ingress.go:224 Ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing annotations...
2022/06/17 07:04:26 TRACE   ingress/ingress.go:235 ingress 'api-gateway/api-gateway-feature-linkerd-setup': processing rules...
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_3'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_4'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_5'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_6'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_7'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_8'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_9'
2022/06/17 07:04:26 TRACE   service/endpoints.go:80 Updating server 'api-gateway-api-gateway-feature-linkerd-setup-http/SRV_10'
2022/06/17 07:04:26 DEBUG   service/service.go:130 Service 'api-gateway/api-gateway-feature-linkerd-setup': new backend 'api-gateway-api-gateway-feature-linkerd-setup-actuator', reload required
Reload required
2022/06/17 07:04:26 DEBUG   maps/main.go:124 Map file 'path-exact' updated, reload required
2022/06/17 07:04:26 DEBUG   handler/refresh.go:57 Deleting backend 'api-gateway-api-gateway-feature-linkerd-setup-management'
2022/06/17 07:04:26 INFO    controller.go:212 HAProxy reloaded
2022/06/17 07:04:26 TRACE   controller.go:218 HAProxy config sync ended
[WARNING]  (256) : Reexecuting Master process
[WARNING]  (256) : config: Can't get version of the global server state file '/var/state/haproxy/global'.
[WARNING]  (269) : Proxy healthz stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (269) : Proxy http stopped (cumulated conns: FE: 0, BE: 0).
[WARNING]  (269) : Proxy https stopped (cumulated conns: FE: 1, BE: 0).
[WARNING]  (269) : Proxy stats stopped (cumulated conns: FE: 2, BE: 0).
[WARNING]  (269) : Stopping frontend GLOBAL in 0 ms.
[WARNING]  (269) : Stopping backend api-gateway-api-gateway-feature-linkerd-setup-http in 0 ms.
[WARNING]  (269) : Stopping backend api-gateway-api-gateway-feature-linkerd-setup-management in 0 ms.
[NOTICE]   (256) : New worker #1 (279) forked
Proxy healthz stopped (cumulated conns: FE: 2, BE: 0).
Proxy http stopped (cumulated conns: FE: 0, BE: 0).
Proxy https stopped (cumulated conns: FE: 1, BE: 0).
Proxy stats stopped (cumulated conns: FE: 2, BE: 0).
Stopping backend api-gateway-api-gateway-feature-linkerd-setup-http in 0 ms.
Stopping backend api-gateway-api-gateway-feature-linkerd-setup-management in 0 ms.
[WARNING]  (256) : Former worker #1 (269) exited with code 0 (Exit)

Please note the missing "Creating server..." lines in v1.7.11 😮

And here it comes: If I remove a path from the ingress spec, so that only one path is present, it works!
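
For clarity, this is the reduced rule set that works, i.e. the rules from the Ingress spec above with the Exact path removed:

rules:
- host: {{ extra_vars_host_url }}
  http:
    paths:
    - path: /
      pathType: Prefix
      backend:
        service:
          name: {{ extra_vars_service_name }}
          port:
            name: https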

I am not sure what's happening, but I guess it has to do with the changes to the ingress update logic in the 1.7.11 release.

If I can assist/test any further, please let me know. Thanks.

dschuldt avatar Jun 17 '22 07:06 dschuldt

Hi @dschuldt, I've fixed the bug. The fix release will come soon. Thanks.

ivanmatmati avatar Jun 17 '22 12:06 ivanmatmati

Awesome, thanks. Already tested the port rename case successfully.

Have a great weekend.

Best

dschuldt avatar Jun 18 '22 06:06 dschuldt

I'm also seeing this (or almost this), though with an Ingress object that does not have several paths set. We're on v1.8.3 and are using Helm to deploy to a K3s v1.22.6 cluster.

When is the fix coming, Sir @ivanmatmati? Thank you.

LarsBingBong avatar Jul 01 '22 11:07 LarsBingBong

@LarsBingBong,

the fix landed in v1.8.2 with this commit.

Do you still experience the issue I described?

dschuldt avatar Jul 01 '22 12:07 dschuldt

Oh yeah @dschuldt, I do... As I write this, I'm on v1.8.3 of HAProxy and still see issues with the backend configuration not being updated.

This specific situation occurred when an Ingress object with auth-basic and auth-secret annotations on it was created. HAProxy picked it up, assigned its IP to it, and the auth part was written to haproxy.cfg, as seen when shelling into a HAProxy Pod of the HAProxy ingress controller workload. However, the backend for said Ingress wasn't created in HAProxy.
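
For reference, a minimal sketch of the kind of Ingress described; the annotation keys and all names are assumptions based on the controller's documented basic-auth support, not values quoted from the actual manifest:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: protected-app                                  # hypothetical name
  annotations:
    haproxy.com/auth-type: basic-auth                  # assumed annotation key for basic auth
    haproxy.com/auth-secret: my-namespace/basic-auth   # assumed key; references a Secret with user/password entries
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: protected-app
                port:
                  number: 443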

Thanks

LarsBingBong avatar Jul 01 '22 17:07 LarsBingBong

@ivanmatmati is there still something here to be cleaned up? We're on the latest version and still see this. It might be connected to specific annotations on the Ingress, as I wrote in my last comment.

LarsBingBong avatar Jul 03 '22 19:07 LarsBingBong

@LarsBingBong, I have not been in the office for the last few days, but our developers told me they had to restart the IC several times due to missing servers in the HAProxy config.

@ivanmatmati I guess there are still some issues.

I will try to debug the ingress specifications in question.

Best Denis

dschuldt avatar Jul 06 '22 07:07 dschuldt

Here is a new observation from today.

I changed our haproxy-kubernetes-ingress-default-backend service definition from

...
annotations:
  haproxy.com/backend-config-snippet: |
    http-request deny deny_status 403
...

to

...
annotations:
  haproxy.com/backend-config-snippet: |
    http-request return status 404 content-type text/plain string "HAProxy IC 404"
...

expecting that requests to the cluster that do not match a configured hostname result in a 404 instead of the previously configured 403.

This works instantly on a newly created controller (deleted pod or restarted deployment).

On two clusters, where the controller has been running for a longer time (around 6 days), the change is not picked up and not written to the haproxy.cfg file.

@ivanmatmati is it possible that the IC loses its connection to the API server or another involved K8s component?

@LarsBingBong are you able to check that behaviour on your side?

dschuldt avatar Jul 06 '22 13:07 dschuldt

Hi @dschuldt, I need some details. On the two clusters where it didn't work, did it work afterwards when the controller pod was restarted, or not? I'm not sure if you mean that it worked when the pod was recreated on another cluster. Beware that a faulty config-snippet will not be reflected in the configuration.

ivanmatmati avatar Jul 07 '22 08:07 ivanmatmati

@ivanmatmati, yes, it did work afterwards when the IC deployment was restarted. The backend config is valid; it was then written to haproxy.cfg and worked as expected. Sorry for the confusion...

dschuldt avatar Jul 07 '22 10:07 dschuldt

@ivanmatmati,

I now have a controller deployment with a single pod that has been running for more than 4 days (no changes were applied to the cluster during the weekend). Here is what I can observe:

  • newly created Ingress configurations are applied.
  • modifications to old Services, like changing selectors or modifying the haproxy.com/backend-config-snippet annotation, are ignored.

Here is a log line from a Service modification event that is picked up by the IC:

2022/07/11 09:00:41 DEBUG   service/service.go:150 Service 'workshop-nginx/nginx-v1-clusterip': backend 'workshop-nginx_nginx-v1-clusterip_80' config-snippet updated: [<nil slice> != [http-request return status 404 content-type text/plain string "Test"]]

If I modify an "old" service, I see no such log...

I have the controller running in trace logging mode. Do you have any idea how to debug this further?

Best

dschuldt avatar Jul 11 '22 09:07 dschuldt

Hi @ivanmatmati,

now I have something:

2022/07/18 09:12:19 ERROR   ingress/ingress.go:245 Ingress 'spot-e2etest-login-service-12337/wiremock': service 'spot-e2etest-login-service-12337/wiremock' does not exist
2022/07/18 09:12:19 ERROR   ingress/ingress.go:245 Ingress 'spot-e2etest-login-service-12337/spot-login': service 'spot-e2etest-login-service-12337/spot-login' does not exist
2022/07/18 09:12:19 ERROR   ingress/ingress.go:245 Ingress 'spot-e2etest-login-service-12337/spot-auth': service 'spot-e2etest-login-service-12337/spot-auth' does not exist

But the referenced services in the namespace exist! And as on the other days, a controller restart fixes the issue.

I wonder if this has something to do with the ingressClass logic. We do not set any in the Ingress objects, and no controller configuration is done...
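
If it is the class logic, a minimal sketch of pinning the class explicitly would look like this; the controller identifier and the --ingress.class flag are assumptions based on the controller documentation:

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: haproxy
spec:
  controller: haproxy.org/ingress-controller   # assumed controller identifier

Each Ingress would then set spec.ingressClassName: haproxy, and the controller would be started with --ingress.class=haproxy so that it only watches matching objects.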

Best D.

dschuldt avatar Jul 18 '22 09:07 dschuldt

Hi @dschuldt , thanks for the update. I'll test it as soon as possible.

ivanmatmati avatar Jul 18 '22 12:07 ivanmatmati

We now use the Helm chart and set a proper ingressClass. The issue still occurs.

I now believe it's related to store.go (lines 186-189):

svc, svcOk := ns.Services[name]
if !svcOk {
	return nil, ErrNotFound(fmt.Errorf("service '%s/%s' does not exist", namespace, name))
}

If the in-memory Services map for a namespace is missing an entry (e.g. because a watch event was missed), every lookup would fail with exactly the error above until the store is rebuilt by a restart.

dschuldt avatar Jul 26 '22 11:07 dschuldt

I wonder if this is related to https://github.com/haproxytech/kubernetes-ingress/issues/312

dschuldt avatar Jul 29 '22 17:07 dschuldt

Here is some interesting information I gathered today. In a cluster without frequent deployments or config changes, this is the RAM consumption before and after a restart. The IC had been running for 19 days.

[Screenshot 2022-08-09 115615]

Please ignore the panel caption.

dschuldt avatar Aug 09 '22 10:08 dschuldt

Hi @dschuldt, thanks. I couldn't dedicate much time to this lately, but for better tracking, can you move your issue to the new one created by another user? It's this one.

ivanmatmati avatar Aug 18 '22 09:08 ivanmatmati