
ExternalName service or knative-serving:controller issue

Open: tokarev-artem opened this issue on Dec 4, 2023 (8 comments)

I have two setups, one on minikube and one on baremetal. On minikube exactly the same setup works perfectly; in the real cluster I am having the issue described below:

In what area(s)?

/area networking

What version of Knative?

eventing: "v1.11.7" serving: "v1.11.5" kourier: "v1.11.5"

Expected Behavior

kubectl get svc 
NAME                          TYPE           CLUSTER-IP      EXTERNAL-IP                                         PORT(S)                                              AGE
event-display                 ExternalName   <none>          kourier-internal.kourier-system.svc.cluster.local   80/TCP                                               2d21h

After deploying the event-display sample from https://github.com/knative-extensions/eventing-ceph/tree/main/samples, I expect the ExternalName to point to the kourier-internal service in the kourier-system namespace.

Actual Behavior

kubectl get svc -n knative-testing 
NAME                          TYPE           CLUSTER-IP      EXTERNAL-IP                                                        PORT(S)                                              AGE
event-display                 ExternalName   <none>          event-display.knative-testing.svc.k8s.olt.stage.sbd.corproot.net   80/TCP                                               2d16h

The service's ExternalName points to itself.
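For reference, the value can be inspected directly (a sketch, assuming the service lives in the knative-testing namespace):

# show the externalName the Route reconciler wrote into the placeholder Service
kubectl -n knative-testing get svc event-display -o jsonpath='{.spec.externalName}{"\n"}'
# and the cluster-local URL on the Knative Service itself
kubectl -n knative-testing get ksvc event-display -o jsonpath='{.status.url}{"\n"}'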

Logs

knative-serving - controller

{
    "severity": "debug",
    "timestamp": "2023-12-01T16:14:55.342Z",
    "logger": "controller",
    "caller": "service/reconciler.go:333",
    "message": "Updating status with:   v1.ServiceStatus{\n  \tStatus: v1.Status{\n  \t\tObservedGeneration: 1,\n  \t\tConditions: v1.Conditions{\n  \t\t\t{Type: \"ConfigurationsReady\", Status: \"True\", LastTransitionTime: {Inner: {Time: s\"2023-12-01 16:14:55 +0000 UTC\"}}},\n  \t\t\t{\n  \t\t\t\tType:               \"Ready\",\n  \t\t\t\tStatus:             \"Unknown\",\n  \t\t\t\tSeverity:           \"\",\n- \t\t\t\tLastTransitionTime: apis.VolatileTime{Inner: v1.Time{Time: s\"2023-12-01 16:14:55 +0000 UTC\"}},\n+ \t\t\t\tLastTransitionTime: apis.VolatileTime{Inner: v1.Time{Time: s\"2023-12-01 16:14:55.342468445 +0000 UTC m=+76.564441018\"}},\n- \t\t\t\tReason:             \"IngressNotConfigured\",\n+ \t\t\t\tReason:             \"Uninitialized\",\n  \t\t\t\tMessage: strings.Join({\n- \t\t\t\t\t\"Ingress has not yet been reconciled.\",\n+ \t\t\t\t\t\"Waiting for load balancer to be ready\",\n  \t\t\t\t}, \"\"),\n  \t\t\t},\n  \t\t\t{\n  \t\t\t\tType:               \"RoutesReady\",\n  \t\t\t\tStatus:             \"Unknown\",\n  \t\t\t\tSeverity:           \"\",\n- \t\t\t\tLastTransitionTime: apis.VolatileTime{Inner: v1.Time{Time: s\"2023-12-01 16:14:55 +0000 UTC\"}},\n+ \t\t\t\tLastTransitionTime: apis.VolatileTime{Inner: v1.Time{Time: s\"2023-12-01 16:14:55.342468445 +0000 UTC m=+76.564441018\"}},\n- \t\t\t\tReason:             \"IngressNotConfigured\",\n+ \t\t\t\tReason:             \"Uninitialized\",\n  \t\t\t\tMessage: strings.Join({\n- \t\t\t\t\t\"Ingress has not yet been reconciled.\",\n+ \t\t\t\t\t\"Waiting for load balancer to be ready\",\n  \t\t\t\t}, \"\"),\n  \t\t\t},\n  \t\t},\n  \t\tAnnotations: nil,\n  \t},\n  \tConfigurationStatusFields: {LatestReadyRevisionName: \"event-display-00001\", LatestCreatedRevisionName: \"event-display-00001\"},\n  \tRouteStatusFields:         {URL: &{Scheme: \"http\", Host: \"event-display.knative-testing.svc.k8s.sbd.example.com\"}, Address: &{URL: &{Scheme: \"http\", Host: \"event-display.knative-testing.svc.k8s.sbd.example.com\"}}, Traffic: {{RevisionName: \"event-display-00001\", LatestRevision: &true, Percent: &100}}},\n  }\n",
    "commit": "dca40e4",
    "knative.dev/pod": "controller-9649cdd58-z9wwk",
    "knative.dev/controller": "knative.dev.serving.pkg.reconciler.service.Reconciler",
    "knative.dev/kind": "serving.knative.dev.Service",
    "knative.dev/traceid": "1a0f62bf-9be4-4590-9212-bb81360a8dd1",
    "knative.dev/key": "knative-testing/event-display",
    "targetMethod": "ReconcileKind"
}
******** 
{
    "severity": "warn",
    "timestamp": "2023-12-04T04:13:39.122Z",
    "logger": "controller",
    "caller": "route/reconcile_resources.go:230",
    "message": "Failed to update k8s service",
    "commit": "dca40e4",
    "knative.dev/pod": "controller-9649cdd58-z9wwk",
    "knative.dev/controller": "knative.dev.serving.pkg.reconciler.route.Reconciler",
    "knative.dev/kind": "serving.knative.dev.Route",
    "knative.dev/traceid": "e95c135c-e1a3-4c11-93ef-906c38434c87",
    "knative.dev/key": "knative-testing/event-display",
    "error": "failed to fetch loadbalancer domain/IP from ingress status"
}
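
The warning above would be consistent with the Ingress status not having any load-balancer address yet. A quick way to check (a sketch; it assumes the v1alpha1 status fields publicLoadBalancer/privateLoadBalancer and uses the same kingress short name as the describe output below):

# print the load-balancer fields from the Ingress status; empty output means none was populated
kubectl -n knative-testing get kingress event-display -o jsonpath='{.status.publicLoadBalancer}{"\n"}{.status.privateLoadBalancer}{"\n"}'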


describe kingress

# kubectl -n knative-testing describe kingress
Name:         event-display
Namespace:    knative-testing
Labels:       serving.knative.dev/route=event-display
              serving.knative.dev/routeNamespace=knative-testing
              serving.knative.dev/service=event-display
Annotations:  networking.internal.knative.dev/rollout:
                {"configurations":[{"configurationName":"event-display","percent":100,"revisions":[{"revisionName":"event-display-00001","percent":100}],"...
              networking.knative.dev/ingress.class: kourier.ingress.networking.knative.dev
              serving.knative.dev/creator: cluster-admin
              serving.knative.dev/lastModifier: cluster-admin
API Version:  networking.internal.knative.dev/v1alpha1
Kind:         Ingress
Metadata:
  Creation Timestamp:  2023-12-01T16:14:55Z
  Finalizers:
    ingresses.networking.internal.knative.dev
  Generation:  1
  Managed Fields:
    API Version:  networking.internal.knative.dev/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:networking.internal.knative.dev/rollout:
          f:networking.knative.dev/ingress.class:
          f:serving.knative.dev/creator:
          f:serving.knative.dev/lastModifier:
        f:labels:
          .:
          f:serving.knative.dev/route:
          f:serving.knative.dev/routeNamespace:
          f:serving.knative.dev/service:
        f:ownerReferences:
          .:
          k:{"uid":"e6c0abf0-4611-4c23-86f9-4e54a14bfd14"}:
      f:spec:
        .:
        f:httpOption:
        f:rules:
    Manager:      controller
    Operation:    Update
    Time:         2023-12-01T16:14:55Z
    API Version:  networking.internal.knative.dev/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"ingresses.networking.internal.knative.dev":
    Manager:      kourier
    Operation:    Update
    Time:         2023-12-01T16:14:55Z
    API Version:  networking.internal.knative.dev/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:      kourier
    Operation:    Update
    Subresource:  status
    Time:         2023-12-01T16:14:55Z
  Owner References:
    API Version:           serving.knative.dev/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Route
    Name:                  event-display
    UID:                   e6c0abf0-4611-4c23-86f9-4e54a14bfd14
  Resource Version:        620834605
  UID:                     77e9ecd6-78b9-45a8-91a5-bc7c9e22c7e3
Spec:
  Http Option:  Enabled
  Rules:
    Hosts:
      event-display.knative-testing
      event-display.knative-testing.svc
      event-display.knative-testing.svc.k8s.sbd.example.com
    Http:
      Paths:
        Splits:
          Append Headers:
            Knative - Serving - Namespace:  knative-testing
            Knative - Serving - Revision:   event-display-00001
          Percent:                          100
          Service Name:                     event-display-00001
          Service Namespace:                knative-testing
          Service Port:                     80
    Visibility:                             ClusterLocal
Status:
  Conditions:
    Last Transition Time:  2023-12-01T16:14:55Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  LoadBalancerReady
    Last Transition Time:  2023-12-01T16:14:55Z
    Status:                True
    Type:                  NetworkConfigured
    Last Transition Time:  2023-12-01T16:14:55Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  Ready
  Observed Generation:     1
Events:                    <none>

In our network policies, all traffic between namespaces is allowed.

Please let me know if I should provide any extra logs.

Thanks

tokarev-artem commented on Dec 4, 2023

baremetal

Does your K8s distribution ship with a LoadBalancer? Our ingress solutions net-* usually create a Service with type LoadBalancer. Check with kubectl get svc -A to see if there is one which is pending.
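
Something like this narrows it down (just a quick filter, not required):

# list any Service whose EXTERNAL-IP is still <pending>
kubectl get svc -A | grep '<pending>'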

ReToCode commented on Dec 4, 2023

baremetal

Does your K8s distribution ship with a LoadBalancer? Our ingress solutions net-* usually create a Service with type LoadBalancer. Check with kubectl get svc -A to see if there is one which is pending.

Thanks for your answer:

kubectl -n kourier-system get svc
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
kourier            LoadBalancer   10.10.219.159   <pending>     80:39844/TCP,443:33738/TCP   3d18h
kourier-internal   ClusterIP      10.10.231.5     <none>        80/TCP,443/TCP               3d18h

But I have exactly the same setup on minikube, where it works without this issue.

tokarev-artem commented on Dec 4, 2023

In the end you'll need a Load Balancer if you want to access the services externally. For the other error, please take a look at https://knative.dev/docs/install/troubleshooting/ and check the logs of the controller pods of your ingress controller to see if there is more information on the problem.
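
For example, something along these lines (a sketch; it assumes the default Kourier layout where the net-kourier-controller Deployment is in knative-serving and the gateway Deployment is 3scale-kourier-gateway in kourier-system):

# ingress-controller side
kubectl -n knative-serving logs deployment/net-kourier-controller --tail=200 | grep -iE 'error|warn'
# gateway side
kubectl -n kourier-system logs deployment/3scale-kourier-gateway --tail=200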

ReToCode commented on Dec 4, 2023

I don't think we'll want to access the services externally.

More logs that will hopefully be useful, from the net-kourier-controller in knative-serving:

{
    "severity": "error",
    "timestamp": "2023-12-04T11:55:41.065Z",
    "logger": "net-kourier-controller",
    "caller": "status/status.go:405",
    "message": "Probing of http://event-display.knative-testing.svc/ failed, IP: x.x.x.x:8081, ready: false, error: error roundtripping http://event-display.knative-testing.svc/healthz: context deadline exceeded (depth: 0)",
    "commit": "76ffa8f-dirty",
    "knative.dev/controller": "knative.dev.net-kourier.pkg.reconciler.ingress.Reconciler",
    "knative.dev/kind": "networking.internal.knative.dev.Ingress",
    "knative.dev/traceid": "354ad05e-ee07-479c-8240-13707163890d",
    "knative.dev/key": "knative-testing/event-display",
    "stacktrace": "knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\tknative.dev/[email protected]/pkg/status/status.go:405\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\tknative.dev/[email protected]/pkg/status/status.go:290"
}

kubectl describe sks -A

<removed>
Status:
  Conditions:
    Last Transition Time:  2023-12-01T16:14:51Z
    Message:               Revision is backed by Activator
    Reason:                ActivatorEndpointsPopulated
    Severity:              Info
    Status:                True
    Type:                  ActivatorEndpointsPopulated
    Last Transition Time:  2023-12-01T16:15:55Z
    Message:               K8s Service is not ready
    Reason:                NoHealthyBackends
    Status:                Unknown
    Type:                  EndpointsPopulated
    Last Transition Time:  2023-12-01T16:15:55Z
    Message:               K8s Service is not ready
    Reason:                NoHealthyBackends
    Status:                Unknown
    Type:                  Ready
  Observed Generation:     4
  Private Service Name:    event-display-00001-private
  Service Name:            event-display-00001
Events:                    <none>

kubectl describe rt -A

Status:
  Address:
    URL:  http://event-display.knative-testing.svc.k8s.example.com
  Conditions:
    Last Transition Time:  2023-12-01T16:14:55Z
    Status:                True
    Type:                  AllTrafficAssigned
    Last Transition Time:  2023-12-01T16:14:55Z
    Message:               autoTLS is not enabled
    Reason:                TLSNotEnabled
    Status:                True
    Type:                  CertificateProvisioned
    Last Transition Time:  2023-12-01T16:14:55Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  IngressReady
    Last Transition Time:  2023-12-01T16:14:55Z
    Message:               Waiting for load balancer to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  Ready
  Observed Generation:     1
  Traffic:
    Latest Revision:  true
    Percent:          100
    Revision Name:    event-display-00001
  URL:                http://event-display.knative-testing.svc.k8s.example.com
Events:               <none>

tokarev-artem commented on Dec 4, 2023

So looking at:

"message": "Probing of http://event-display.knative-testing.svc/ failed, IP: x.x.x.x:8081, ready: false, error: error roundtripping http://event-display.knative-testing.svc/healthz: context deadline exceeded (depth: 0)",

Which pod does that IP belong to? What is in that pod's logs? Does your Knative Service start up and become healthy and ready? What is in the logs of the Knative Service pod?

ReToCode commented on Dec 4, 2023

Which pod does that IP belong to?

{ "severity": "error", "timestamp": "2023-12-01T15:35:12.841Z", "logger": "net-kourier-controller", "caller": "status/status.go:405", "message": "Probing of http://event-display.knative-testing.svc/ failed, IP: 25.2.20.179:8081, ready: false, error: error roundtripping http://event-display.knative-testing.svc/healthz: dial tcp 25.2.20.179:8081: i/o timeout (depth: 0)", "commit": "76ffa8f-dirty", "knative.dev/controller": "knative.dev.net-kourier.pkg.reconciler.ingress.Reconciler", "knative.dev/kind": "networking.internal.knative.dev.Ingress", "knative.dev/traceid": "0560bc16-b068-4332-aec1-cf6397f3f2b7", "knative.dev/key": "knative-testing/event-display", "stacktrace": "knative.dev/networking/pkg/status.(*Prober).processWorkItem\n\tknative.dev/[email protected]/pkg/status/status.go:405\nknative.dev/networking/pkg/status.(*Prober).Start.func1\n\tknative.dev/[email protected]/pkg/status/status.go:290" }

This IP is in our private range for the cluster; we don't route it outside.

Does your Knative Service start up and become healthy and ready?

Everything is up and running, without restarts:

kubectl -n knative-testing get po 
NAME                                                              READY   STATUS    RESTARTS         AGE
cephsource-my-ceph-source-b5c2382e-9fd4-43be-9a60-2449cb026dzcv   1/1     Running   0                2d20h
test                                                              1/1     Running   246 (109s ago)   2d20h
kubectl -n knative-serving get po 
NAME                                     READY   STATUS    RESTARTS   AGE
activator-65f547fb46-nghgg               1/1     Running   0          2d20h
autoscaler-79c7bdd866-cs8tp              1/1     Running   0          2d21h
controller-9649cdd58-z9wwk               1/1     Running   0          2d20h
net-kourier-controller-f965c4fcc-ctm4t   1/1     Running   0          2d21h
webhook-5476f574b9-rdl59                 1/1     Running   0          2d21h
kubectl -n knative-eventing get po 
NAME                                        READY   STATUS    RESTARTS   AGE
eventing-controller-5db4ddd5f5-pbw4d        1/1     Running   0          2d21h
eventing-webhook-857589c9bb-jb58r           1/1     Running   0          2d21h
kafka-broker-dispatcher-7f85cfdb64-gvzb7    1/1     Running   0          3d
kafka-broker-receiver-84d95b74dd-trlhp      1/1     Running   0          3d
kafka-channel-dispatcher-7694b847d7-6flss   1/1     Running   0          3d
kafka-channel-receiver-d6d9f57bc-2nn8v      1/1     Running   0          3d
kafka-controller-784d64bdf5-qbc4x           1/1     Running   0          3d
kafka-sink-receiver-59cc499fd-f5q8g         1/1     Running   0          3d
kafka-webhook-eventing-5b5d488bc5-jxxfw     1/1     Running   0          3d
kubectl -n kourier-system get po 
NAME                                     READY   STATUS    RESTARTS   AGE
3scale-kourier-gateway-9dfbbd9fc-z65x6   1/1     Running   0          3d20h

What is in the logs of the Knative Service pod?

Do you mean event-display? If so, it's not spawning, because it can't be triggered due to the invalid service ExternalName.

tokarev-artem commented on Dec 4, 2023

And I found this one:

kubectl -n knative-testing get ev
....
11m         Warning   UpdateFailed        cephsource/my-ceph-source                              Failed to update status for "my-ceph-source": cephsources.sources.knative.dev "my-ceph-source" not found
...

But this CephSource exists:

kubectl -n knative-testing get cephsource my-ceph-source -o yaml
apiVersion: sources.knative.dev/v1alpha1
kind: CephSource
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"sources.knative.dev/v1alpha1","kind":"CephSource","metadata":{"annotations":{},"name":"my-ceph-source","namespace":"knative-testing"},"spec":{"port":"8888","sink":{"ref":{"apiVersion":"serving.knative.dev/v1","kind":"Service","name":"event-display"}}}}
  creationTimestamp: "2023-12-05T08:11:34Z"
  generation: 1
  name: my-ceph-source
  namespace: knative-testing
  resourceVersion: "624147602"
  uid: 08b1c5af-faa9-4645-a5a4-6e38f76a26ee
spec:
  port: "8888"
  serviceAccountName: default
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
status: {}
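
One thing I might check next (not sure if it is related) is whether the CRD the eventing controller sees matches this resource's group/version; a sketch:

# confirm the CephSource CRD and its served versions
kubectl get crd cephsources.sources.knative.dev -o jsonpath='{.spec.versions[*].name}{"\n"}'
kubectl api-resources | grep -i cephsource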

tokarev-artem commented on Dec 5, 2023

ready: false, error: error roundtripping http://event-display.knative-testing.svc/healthz: dial tcp 25.2.20.179:8081:

Regarding the pod: I meant, which pod does this IP belong to?
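
For example (a sketch; the IP is the one from your log, and status.podIP is a supported pod field selector in recent Kubernetes versions):

# map the probed IP back to a pod
kubectl get pods -A -o wide --field-selector status.podIP=25.2.20.179
# fallback if the field selector is not supported in your version
kubectl get pods -A -o wide | grep 25.2.20.179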

cephsource

I don't think that is the issue. Have you tried with a plain Knative Service like https://knative.dev/docs/getting-started/first-service/?
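
Roughly like this (a sketch based on that page; the image tag may differ from what the docs currently use):

kubectl apply -n knative-testing -f - <<EOF
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest
          env:
            - name: TARGET
              value: "World"
EOF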

What is in the logs of the Knative Service pod?

It should spawn when you first create the Knative Service. It will be scaled to zero after 30 seconds. You should be able to take a look at the logs during that time.
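
For example (a sketch; the serving.knative.dev/service label and the user-container name are what Serving puts on revision pods by default):

# watch the revision pod come up
kubectl -n knative-testing get pods -l serving.knative.dev/service=event-display -w
# and tail its logs while it is up
kubectl -n knative-testing logs -l serving.knative.dev/service=event-display -c user-container --tail=100 -f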

Also, there should be errors in the Knative controller pods. Make sure to check for errors in the logs of the pods in the knative-serving namespace according to https://knative.dev/docs/install/troubleshooting/, especially the controller pod and the net-kourier-controller pod. I'm pretty sure you'll find more info on the issue in there.

ReToCode commented on Dec 6, 2023

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] commented on Mar 6, 2024