Service endpoints are not updated / removed after upgrade to Kubernetes 1.28

Open mbrancato opened this issue 5 months ago • 4 comments

What version of Knative?

0.15.2

Expected Behavior

Endpoints should be updated as pods are added and removed.

Actual Behavior

Endpoints for a service are not updated on scale-down or when pods are deleted. This leaves a large number of stale addresses in the Endpoints object, and the stale set propagates to the public service as well.

% kubectl -n detection get endpoints my-app-00112-private
NAME                      ENDPOINTS                                                              AGE
my-app-00112-private   10.32.101.40:9091,10.32.101.41:9091,10.32.101.43:9091 + 5997 more...   136m

% kubectl -n detection get deploy my-app-00112-deployment
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
my-app-00112-deployment   2/2     2            2           136m
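
To put a number on the mismatch above (thousands of endpoint addresses against 2 ready replicas), the Endpoints object can be compared with the pod IPs that actually exist. This is only a rough sketch: the helper name `stale_endpoint_ips` and the sample data are made up for illustration, and which of the listed IPs are still live is an assumption.

```python
def stale_endpoint_ips(endpoints, live_pod_ips):
    """Return endpoint IPs that do not belong to any live pod.

    `endpoints` is a parsed v1 Endpoints object (a dict);
    `live_pod_ips` is an iterable of pod IPs (status.podIP).
    """
    live = set(live_pod_ips)
    stale = []
    for subset in endpoints.get("subsets") or []:
        # Check both ready and not-ready addresses.
        for key in ("addresses", "notReadyAddresses"):
            for addr in subset.get(key) or []:
                ip = addr.get("ip")
                if ip and ip not in live:
                    stale.append(ip)
    return stale


# Hypothetical sample: three endpoint addresses, two pods assumed live.
example = {
    "subsets": [
        {
            "addresses": [
                {"ip": "10.32.101.40"},
                {"ip": "10.32.101.41"},
                {"ip": "10.32.101.43"},
            ]
        }
    ]
}
print(stale_endpoint_ips(example, ["10.32.101.40", "10.32.101.41"]))
# prints ['10.32.101.43']
```

The input can come from `kubectl -n detection get endpoints my-app-00112-private -o json`, with the live IPs taken from the `status.podIP` field of the revision's pods.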

I was able to get logs like this from SKS:

{
apiVersion: "v1"
eventTime: null
involvedObject: {
apiVersion: "networking.internal.knative.dev/v1alpha1"
kind: "ServerlessService"
name: "my-app-00112"
namespace: "detection"
resourceVersion: "6779758389"
uid: "f6ed0598-0171-43ff-bf7a-c45069fdcbe2"
}
kind: "Event"
lastTimestamp: "2024-09-14T15:38:13Z"
message: "SKS: my-app-00112 does not own Service: my-app-00112-private"
metadata: {
creationTimestamp: "2024-09-14T15:38:13Z"
managedFields: [1]
name: "my-app-00112.17f5266fbfda92c2"
namespace: "detection"
resourceVersion: "3317050884"
uid: "20dcc671-4abb-490c-aff8-7404dfdf8063"
}
reason: "InternalError"
reportingComponent: "serverlessservice-controller"
reportingInstance: ""
source: {
component: "serverlessservice-controller"
}
type: "Warning"
}
logName: "projects/my-project-92384924/logs/events"
receiveTimestamp: "2024-09-14T15:38:13.778779952Z"
resource: {
labels: {
cluster_name: "my-cluster-192132"
location: "us-central1-c"
project_id: "my-project-92384924"
}
type: "k8s_cluster"
}
severity: "WARNING"
timestamp: "2024-09-14T15:38:13Z"
}
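
The "does not own Service" message suggests the controller is checking the owner reference on the private Service. Assuming it is the usual Kubernetes controller-ownership check (the object's `ownerReferences` entry with `controller: true` must carry the owner's UID), it can be reproduced by hand. This is a sketch under that assumption, not the Knative code; the helper names and the Service's owner UID below are hypothetical.

```python
def controller_ref(obj):
    """Return the ownerReference marked controller=true, or None."""
    for ref in obj.get("metadata", {}).get("ownerReferences") or []:
        if ref.get("controller"):
            return ref
    return None


def is_controlled_by(obj, owner):
    """True if obj's controller reference points at owner's UID."""
    ref = controller_ref(obj)
    owner_uid = owner.get("metadata", {}).get("uid")
    return ref is not None and ref.get("uid") == owner_uid


# SKS UID taken from the event above; the Service's owner UID is made up.
sks = {"metadata": {"name": "my-app-00112",
                    "uid": "f6ed0598-0171-43ff-bf7a-c45069fdcbe2"}}
service = {"metadata": {
    "name": "my-app-00112-private",
    "ownerReferences": [{
        "kind": "ServerlessService",
        "name": "my-app-00112",
        "controller": True,
        "uid": "00000000-0000-0000-0000-000000000000",  # hypothetical
    }],
}}
print(is_controlled_by(service, sks))
# prints False
```

Both objects can be dumped with `kubectl -n detection get service my-app-00112-private -o json` and `kubectl -n detection get serverlessservices.networking.internal.knative.dev my-app-00112 -o json`. A missing controller reference or a UID mismatch would both yield `False` here.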

Steps to Reproduce the Problem

This happens with every ksvc of ours that scales up and then back down, or that has pods removed (via delete or eviction).

mbrancato, Sep 14 '24 18:09