DomainMapping to Kubernetes Service stuck in "Uninitialized"

Open xtaje opened this issue 3 years ago • 3 comments

What version of Knative?

1.0

Expected Behavior

DomainMapping with a Kubernetes Service should become ready.

Actual Behavior

I am trying to use a DomainMapping with a regular Kubernetes Service, as described in the Tips in the docs. The custom host is "my-test.biz".

After creation, the DomainMapping remains in Ready state "Unknown" with Reason "Uninitialized".

The DomainMapping itself does seem to work: I can hit the Kourier ingress gateway with the Host header set to "mytest-biz" and get a 200 back from the service. However, the status of the DomainMapping never becomes ready. This seems to be caused by the Kourier probe.
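To illustrate why the Host header matters for a request like this, here is a minimal local sketch. It is not Kourier or Envoy: `HostRoutedHandler` is a hypothetical stand-in that answers 200 only for the mapped host, and `probe`, `run_demo`, and `MAPPED_HOST` are names made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

MAPPED_HOST = "my-test.biz"  # assumption: the custom host from the DomainMapping


class HostRoutedHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a gateway that routes on the Host header."""

    def do_GET(self):
        host = self.headers.get("Host", "")
        # Known host -> 200, anything else -> 404 (no matching route).
        self.send_response(200 if host == MAPPED_HOST else 404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe(address, port, host_header, path="/"):
    """GET address:port with an explicit Host header and return the status code."""
    conn = http.client.HTTPConnection(address, port, timeout=5)
    try:
        conn.request("GET", path, headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()


def run_demo():
    """Start the stand-in gateway on a free local port and probe it twice."""
    server = HTTPServer(("127.0.0.1", 0), HostRoutedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return (
            probe("127.0.0.1", port, MAPPED_HOST),
            probe("127.0.0.1", port, "other.example"),
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` sends the same request twice against the same address, changing only the Host header, which is the same kind of manual check as the curl described above.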

The DomainMapping conditions show this status:

Status:
  Address:
    URL:  http://hotcakes-test.biz
  Conditions:
    Last Transition Time:  2022-07-25T18:32:53Z
    Message:               autoTLS is not enabled
    Reason:                TLSNotEnabled
    Status:                True
    Type:                  CertificateProvisioned
    Last Transition Time:  2022-07-25T18:32:53Z
    Status:                True
    Type:                  DomainClaimed
    Last Transition Time:  2022-07-25T18:32:53Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  IngressReady
    Last Transition Time:  2022-07-25T18:32:53Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  Ready
    Last Transition Time:  2022-07-25T18:32:53Z
    Status:                True
    Type:                  ReferenceResolved
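As a reading aid for the conditions above, the following sketch lists which sub-conditions are holding back the top-level Ready condition. The list is transcribed from the status output; the lowercase keys assume the JSON form returned by `kubectl get -o json`, and `blocking_conditions` is a hypothetical helper, not part of any Knative tooling.

```python
# Conditions transcribed from the status output above.
CONDITIONS = [
    {"type": "CertificateProvisioned", "status": "True",
     "reason": "TLSNotEnabled", "message": "autoTLS is not enabled"},
    {"type": "DomainClaimed", "status": "True"},
    {"type": "IngressReady", "status": "Unknown",
     "reason": "Uninitialized",
     "message": "Waiting for load balancer to be ready"},
    {"type": "Ready", "status": "Unknown",
     "reason": "Uninitialized",
     "message": "Waiting for load balancer to be ready"},
    {"type": "ReferenceResolved", "status": "True"},
]


def blocking_conditions(conditions):
    """Return every condition other than Ready whose status is not "True"."""
    return [
        c for c in conditions
        if c["type"] != "Ready" and c.get("status") != "True"
    ]


for cond in blocking_conditions(CONDITIONS):
    print(cond["type"], cond["status"], cond.get("reason"), cond.get("message"))
```

Here only IngressReady is reported, which matches the Reason and Message copied onto the Ready condition.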

Inspecting the logs of the kourier-control pod shows messages of this form:

{"level":"error","ts":"2022-07-27T04:57:55.987+0800","logger":"kourier.status-manager","caller":"status/status.go:399","msg":"Probing of http://mytest-test.biz/ failed, IP: 172.21.129.123:8080, ready: false, error: unexpected status code: want 200, got 404 (depth: 0)","knative.dev/key":"backend/mytest-test.biz","stacktrace":"knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:399\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:289"}
{"level":"error","ts":"2022-07-27T04:57:55.987+0800","logger":"kourier.status-manager","caller":"status/status.go:399","msg":"Probing of http://mytest-test.biz/ failed, IP: 172.21.129.123:8080, ready: false, error: unexpected status code: want 200, got 404 (depth: 0)","knative.dev/key":"backend/mytest-test.biz","stacktrace":"knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:399\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:289"}
{"level":"error","ts":"2022-07-27T04:57:55.988+0800","logger":"kourier.status-manager","caller":"status/status.go:399","msg":"Probing of http://mytest-test.biz/ failed, IP: 172.21.160.89:8080, ready: false, error: unexpected status code: want 200, got 404 (depth: 0)","knative.dev/key":"backend/mytest-test.biz","stacktrace":"knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:399\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:289"}
{"level":"error","ts":"2022-07-27T04:57:55.988+0800","logger":"kourier.status-manager","caller":"status/status.go:399","msg":"Probing of http://mytest-test.biz/ failed, IP: 172.21.160.89:8080, ready: false, error: unexpected status code: want 200, got 404 (depth: 0)","knative.dev/key":"backend/mytest-test.biz","stacktrace":"knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:399\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\t/go/pkg/mod/knative.dev/[email protected]/pkg/status/status.go:289"}

where those IPs are for the two kourier gateways. If I grab a shell into the kourier-control container, I am able to curl those two IPs if I add a Host: mytest.biz header.
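For anyone triaging similar logs, here is a small sketch that pulls the probed URL, gateway address, and status codes out of one of these entries. The sample line is one of the entries above with the `stacktrace` field omitted for brevity; the regular expression is an assumption based only on the message format shown here, and `parse_probe_failure` is a hypothetical helper.

```python
import json
import re

# One of the log entries above, with the "stacktrace" field omitted.
LOG_LINE = (
    '{"level":"error","ts":"2022-07-27T04:57:55.987+0800",'
    '"logger":"kourier.status-manager","caller":"status/status.go:399",'
    '"msg":"Probing of http://mytest-test.biz/ failed, '
    'IP: 172.21.129.123:8080, ready: false, error: unexpected status code: '
    'want 200, got 404 (depth: 0)",'
    '"knative.dev/key":"backend/mytest-test.biz"}'
)

# Assumed message layout, derived from the entries in this issue only.
PROBE_FAILURE = re.compile(
    r"Probing of (?P<url>\S+) failed, IP: (?P<ip>[\d.]+):(?P<port>\d+), "
    r"ready: \w+, error: unexpected status code: "
    r"want (?P<want>\d+), got (?P<got>\d+)"
)


def parse_probe_failure(line):
    """Return the probe details from one JSON log line, or None if it does not match."""
    entry = json.loads(line)
    match = PROBE_FAILURE.search(entry.get("msg", ""))
    if match is None:
        return None
    return {
        "url": match["url"],
        "ip": match["ip"],
        "port": int(match["port"]),
        "want": int(match["want"]),
        "got": int(match["got"]),
    }


print(parse_probe_failure(LOG_LINE))
```

Grouping the parsed results by `ip` makes it easy to see whether every gateway returns the same unexpected status.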

The cluster is using Kourier and Istio Service Mesh.

Steps to Reproduce the Problem

Create a normal k8s service. Create a DomainMapping with a custom hostname. The DomainMapping will be stuck in "Uninitialized."

xtaje avatar Jul 26 '22 20:07 xtaje

/area networking /triage accepted

I confirmed this using the following setup

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app.kubernetes.io/name: proxy
spec:
  containers:
  - name: nginx
    image: nginx:stable
    ports:
      - containerPort: 80
        name: http-web-svc
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app.kubernetes.io/name: proxy
  ports:
  - name: name-of-service-port
    protocol: TCP
    port: 80
    targetPort: http-web-svc
---
apiVersion: serving.knative.dev/v1alpha1
kind: DomainMapping
metadata:
  name: nginx.com
spec:
  ref:
    name: nginx-service
    kind: Service
    apiVersion: v1

This seems to work fine for Contour but not for Kourier (cc @nak3) & Istio (cc @AngeloDanducci)

dprotaso avatar Jul 29 '22 15:07 dprotaso

I am able to make this work in our Knative installation using Istio Ingress. Unfortunately we have two different clusters with customized installations, so I couldn't tell you what the difference is.

xtaje avatar Jul 29 '22 19:07 xtaje

How are you customizing the istio installation?

If you're using istioctl and a config like this https://github.com/knative-sandbox/net-istio/blob/main/third_party/istio-latest/istio-ci-no-mesh.yaml that could provide some insights

dprotaso avatar Jul 29 '22 19:07 dprotaso

Hi, can I work on this issue?

Gekko0114 avatar Jun 27 '23 14:06 Gekko0114

/assign @Gekko0114

Of course! Thank you so much @Gekko0114

nak3 avatar Jun 28 '23 00:06 nak3

Hi @dprotaso,

I have two questions. Could you answer my questions when you have time?

Q1: I am using DomainMapping and Kubernetes service on Kind. Is the following YAML correct?

apiVersion: networking.internal.knative.dev/v1alpha1
kind: ClusterDomainClaim
metadata:
  name: test.biz
spec:
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app.kubernetes.io/name: proxy
spec:
  containers:
  - name: nginx
    image: nginx:stable
    ports:
      - containerPort: 80
        name: http-web-svc
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app.kubernetes.io/name: proxy
  ports:
  - name: name-of-service-port
    protocol: TCP
    port: 80
    targetPort: http-web-svc
---
apiVersion: serving.knative.dev/v1beta1
kind: DomainMapping
metadata:
  name: test.biz
  namespace: default
spec:
  ref:
    name: nginx-service
    kind: Service
    apiVersion: v1

Q2: Is there any documentation on how to use a Kubernetes Service with DomainMapping? I couldn't find any, even after searching the repository.

BTW, I have been testing it using Kind, and I was able to confirm the same issue. However, the error with 'net-kourier-controller' was not 404, but 503.

{"severity":"INFO","timestamp":"2023-07-01T13:32:23.099294555Z","logger":"net-kourier-controller","caller":"status/status.go:362","message":"Processing probe for http://test.biz/, IP: 10.244.0.4:8090 (depth: 1)","commit":"790358f-dirty","knative.dev/controller":"knative.dev.net-kourier.pkg.

Gekko0114 avatar Jul 01 '23 13:07 Gekko0114

I have identified the cause of this issue. There are two problems.

  1. There is an issue with the Endpoint to the Kubernetes service (at least in my environment). In the current implementation, the targetPort is being used for the port number of the Kubernetes service, but it should be using the externalPort. here
  2. There are cases where requests cannot be sent to /healthz. When creating a Kubernetes pod, it does not have an envoy sidecar. As a result, sometimes requests cannot be sent to healthz, causing issues with domainMapping. here
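To make the first point concrete, here is a sketch of the difference between a Service's `port` and its `targetPort`, using the nginx manifests from earlier in the thread. It assumes that "externalPort" above refers to the Service's `port` field; `service_port` and `backend_port` are hypothetical helpers for illustration, not the Kourier code in question.

```python
# Port entry of nginx-service and the pod spec, copied from the manifests above.
SERVICE_PORT = {
    "name": "name-of-service-port",
    "protocol": "TCP",
    "port": 80,
    "targetPort": "http-web-svc",
}
POD_SPEC = {
    "containers": [
        {
            "name": "nginx",
            "image": "nginx:stable",
            "ports": [{"containerPort": 80, "name": "http-web-svc"}],
        }
    ]
}


def service_port(port_spec):
    """Port exposed on the Service itself (the `port` field)."""
    return port_spec["port"]


def backend_port(port_spec, pod_spec):
    """Port the pod listens on, resolved from `targetPort`.

    `targetPort` may be a number or the name of a container port; when it is
    omitted, Kubernetes uses the value of `port`.
    """
    target = port_spec.get("targetPort", port_spec["port"])
    if isinstance(target, int):
        return target
    for container in pod_spec["containers"]:
        for container_port in container.get("ports", []):
            if container_port.get("name") == target:
                return container_port["containerPort"]
    raise LookupError(f"no container port named {target!r}")


print(service_port(SERVICE_PORT), backend_port(SERVICE_PORT, POD_SPEC))
```

In the manifests above both values happen to be 80, so the distinction only shows up when a Service maps, say, port 8080 to a container listening on 80. Traffic addressed to the Service must use the former, while traffic sent straight to a pod must use the latter.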

The first problem can be easily fixed. However, I am not sure how I should address the second problem. There are several possible solutions, but which one is best for the Knative community? These are my ideas.

  • If Kourier is available, force the creation of an envoy sidecar when setting up the pod.
  • Document the considerations for using domainMapping with the Kubernetes service as the target. No changes to the code.
  • Change the health check path for the Kubernetes pod from "/healthz" to "/" when using kubernetes service as the target.

@dprotaso @nak3 @skonto wdyt?

Gekko0114 avatar Jul 10 '23 13:07 Gekko0114

Sorry, I still don't have a good idea, but I'll leave a few comments.

There is an issue with the Endpoint to the Kubernetes service (at least in my environment). In the current implementation, the targetPort is being used for the port number of the Kubernetes service, but it should be using the externalPort. here

This change will break KService, won't it?

Change the health check path for the Kubernetes pod from "/healthz" to "/" when using kubernetes service as the target.

This must assume that the user's k8s pod returns 200 from /, right? If so, it won't work when the user's app disallows /.

I still don't have a good idea and so wanted to check how Contour works, but @Gekko0114's example https://github.com/knative/serving/issues/13158#issuecomment-1615926674 has produced the same issue on Contour...

nak3 avatar Jul 11 '23 01:07 nak3

Hi @nak3, Thanks for your comment.

This change will break KService, won't it?

It might break KService, so I will try not to break KService.

This must assume that the user's k8s pod returns 200 from /, right? If so, it won't work when the user's app disallows /.

You're right. It might not be a good solution.

I still don't have a good idea and so wanted to check how Contour works, but @Gekko0114's example https://github.com/knative/serving/issues/13158#issuecomment-1615926674 has produced the same issue on Contour...

I am not sure how Contour works. BTW, I am not sure whether DomainMapping is supposed to work with a Kubernetes Service at all.

Gekko0114 avatar Jul 11 '23 12:07 Gekko0114