traefik icon indicating copy to clipboard operation
traefik copied to clipboard

Add HealthCheck for KubernetesCRD ExternalName services

Open marcmognol opened this issue 1 year ago • 19 comments

What does this PR do?

This PR adds the possibility to configure HealthChecks for external named services.

Motivation

In some cases, we need to be able to configure HealthChecks on ExternalName services configured with CRD. With this PR it will be possible to do the following :

---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: wrr
spec:
  weighted:
    services:
       - name: external-svc1
         scheme: https
         port: 443
       - name: external-svc2
         scheme: https
         port: 443
         healthCheck:
           hostname: static.ifs.asf.pri
           path: /landing-page
           interval: 15
           followRedirects: false

Big thanks to @StephanSalas for the inspiration.

Closes #8147 Closes #8541

More

  • [x] Added/updated tests
  • [ ] Added/updated documentation

Additional Notes

marcmognol avatar Feb 23 '24 21:02 marcmognol

Hi @emilevauge,

This PR is related to issue #8147 for which you requested help from the community. I would like, if you may, tell me if I am on the right direction with this draft PR ?

Then, if it is ok, I'll add unit tests and documentation.

I tested these modifications and it answers my need.

Thanks for your help.

Marc

marcmognol avatar Feb 23 '24 21:02 marcmognol

Hello,

Do I have to rebase it on v2.11 or can I keep it on master ?

Thanks.

marcmognol avatar Feb 26 '24 10:02 marcmognol

Hey @marcmognol,

No need to rebase the PR (it's an enhancement). We mark it as design-review to allow the team to check your PR design before moving forward on the review.

nmengin avatar Feb 26 '24 13:02 nmengin

Hello @marcmognol and thanks for this contribution,

Instead of allowing the user to define the server URLs in a TraefikService, we would prefer to leverage K8s ExternalName services and provide a property allowing the user to enable healtchecks for ExternalName services.

As explained in https://github.com/traefik/traefik/issues/8541#issuecomment-1081014912, I think you can get some inspiration from #8663. If that is ok for you, could you please also give some credits to @StephanSalas in the pull request description?

kevinpollet avatar Mar 05 '24 14:03 kevinpollet

Hello @kevinpollet

Thank you for your reply.

I choose to do this way, because I thought it is easier to configure with the CRD only.

So, I followed your advice, and thanks to the PR of @StephanSalas I was able to add some changes to this PR.

Could you please tell me if I am in the right direction before going further ? Editing the title of this PR and add reference to the previous PR of @StephanSalas.

So, with this method, I can define Health Check on the service level. My concerns is that when I define 2 services for a single Ingress Route, Traefik will create 3 services, 2 of type Load Balancer and one of type Weighted.

Is there any performance downside compared to a single service of type Load Balancer? What is the difference between this configuration and a single Load Balancer?

Thanks for your answers.

Marc

marcmognol avatar Mar 06 '24 21:03 marcmognol

Hello @kevinpollet

I can achieve having single LoadBalancer with CRD when I use static Endpoints object for a Service. But I can only use IP address, I am not able to use FQDN of external services on the Endpoint.

That's why I think addind CRD LoadBalancer type can be a great idea.

Please let me know if we could also implement CRD for LoadBalancer?

Thanks

marcmognol avatar Mar 13 '24 07:03 marcmognol

Hello @marcmognol and sorry for the late answer,

Is there any performance downside compared to a single service of type Load Balancer? What is the difference between this configuration and a single Load Balancer?

There is not a real performance downside. This is because, in the IngressRoute configuration, it is possible to configure a weight for the service, which implies using a WRR. This is not true anymore since https://github.com/traefik/traefik/pull/10372, but this should not be changed in this PR.

I can achieve having single LoadBalancer with CRD when I use static Endpoints object for a Service. But I can only use IP address, I am not able to use FQDN of external services on the Endpoint.

Here I am not sure to follow can you elaborate?

Regarding the code, here https://github.com/traefik/traefik/pull/10467/files#diff-7d5c919304c65587ab6f978e73008d6d268e9bbb6db829eadf8d72f4f492e28fR306 the health check has to be set up only for external name services. For other services, we want to rely only on K8s healtcheck, did I miss something?

kevinpollet avatar Mar 25 '24 14:03 kevinpollet

Hello @kevinpollet,

Thanks for your reply. It is good to know there is no performance downside regarding wrr and it seeams that it is something I could use in my situation. I also need to pay attention that each request going through this ingressroute is transferred to the same service, so I need to use sticky cookie, but AFAIK the sticky cookie is set on the WRR level not the loabalancer itself.

In fact I would need to convert the following file configuration into CRDs:

  services:
    uat-service:
      loadBalancer:
        serversTransport: uat-transport
        servers:
          - url: https://localserver-1.asf.pri:443
          - url: https://localserver-2.asf.pri:443
        healthCheck:
          hostname: uat.local.pri
          path: /landing-page/
          interval: 15s
        sticky:
          cookie:
            secure: true
            httpOnly: true

Where keeping sticky cookie and health checks are mandatory. That's why this PR implements the creation of CRD for LoadBalancer which is for my point of view the best solution. But, feel free to tell what would be the best solution.

Thanks for your help on this.

marcmognol avatar Mar 25 '24 23:03 marcmognol

Hello @marcmognol,

I think there is a misunderstanding, https://github.com/traefik/traefik/issues/8147 and https://github.com/traefik/traefik/issues/8541 are about adding the health check feature for ExternalName services. For other K8s service types, we want to use the orchestrator health check which adds or removes endpoints dynamically to the corresponding services.

If I am not missing something by using ExternalName services it is already possible to do what you want, we are just missing the healtcheck feature (please note that the following option has to be enabled).

Something like:

---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: foo
spec:
  routes:
    - match: Host(`foo.com`)
      kind: Rule
      services:
        - name: wrr
          kind: TraefikService

---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: wrr
spec:
  weighted:
    services:
       - name: external-svc1
         scheme: https
         port: 443
       - name: external-svc2
         scheme: https
         port: 443

---
apiVersion: v1
kind: Service
metadata:
  name: external-svc1
spec:
  externalName: localserver-1.asf.pri
  type: ExternalName

---
apiVersion: v1
kind: Service
metadata:
  name: external-svc2
spec:
  externalName: localserver-2.asf.pri
  type: ExternalName

Does it make sense?

kevinpollet avatar Mar 26 '24 13:03 kevinpollet

Hello @kevinpollet

Yes it makes sense thank you. What about the sticky cookie which is required for me to ensure each request is routed to right service ?

If it is possible, then I'll update actual PR to add HealthCheck feature.

Marc

marcmognol avatar Mar 26 '24 16:03 marcmognol

Hello @marcmognol,

Yes it makes sense thank you. What about the sticky cookie which is required for me to ensure each request is routed to right service ?

I missed to add the sticky configuration in the example, as explained in https://doc.traefik.io/traefik/routing/providers/kubernetes-crd/#stickiness-and-load-balancing it is possible to configure the stickiness at the WRR level.

kevinpollet avatar Mar 27 '24 08:03 kevinpollet

Hello @kevinpollet Thanks for everything. I've updated the PR. Could you please have a look ?

Thanks.

marcmognol avatar Mar 28 '24 18:03 marcmognol

Hello @marcmognol, did I miss something, the code is still adding the health check configuration for all services? (https://github.com/traefik/traefik/pull/10467/files#diff-7d5c919304c65587ab6f978e73008d6d268e9bbb6db829eadf8d72f4f492e28fR306)

kevinpollet avatar Mar 29 '24 08:03 kevinpollet

Hello @kevinpollet You're right, I've added a check (+ unit test) when service type is not ExternalName and HealthCheck is defined. Am I in the right direction ?

Thanks

marcmognol avatar Mar 31 '24 19:03 marcmognol

@kevinpollet I rebased my branch and now tests are failing. Could you please tell me if I am in the right direction before fixing my code to fit new unit tests ?

Thanks

marcmognol avatar Apr 07 '24 16:04 marcmognol

Hello @marcmognol,

Sorry for the late answer, you are in the right direction, I added some suggestions. Thanks for the work.

kevinpollet avatar Apr 22 '24 14:04 kevinpollet

Hi @kevinpollet

Thanks for your feedback. Feel free to tell me if I can do more for this PR.

Marc

marcmognol avatar Apr 22 '24 18:04 marcmognol

Hi @marcmognol,

Last thing, could you add some documentation in the following section https://doc.traefik.io/traefik/routing/providers/kubernetes-crd/#kind-ingressroute?

Could you also rebase the pull request to fix the CI?

kevinpollet avatar Apr 23 '24 07:04 kevinpollet

@kevinpollet

Thanks for your comment. I've rebased the PR, but during this process I've made some mistakes, so I had to force push... Sorry. I also added some documentation as requested.

Marc

marcmognol avatar Apr 23 '24 12:04 marcmognol