java icon indicating copy to clipboard operation
java copied to clipboard

Add SharedInformer and SharedInformerFactory setDebugItems

Open codefromthecrypt opened this issue 1 year ago • 9 comments
trafficstars

This centralizes logging into the existing ReflectorRunnable type, notably adding conditional logging of events.

This is toggled by setDebugItems(bool) on SharedInformer and SharedInformerFactory as ReflectorRunnable isn't directly accessible by end users.

Now, you can configure these independent of ApiType or if your needs are simple, use the logger.

boolean shouldDebugItems = LogFactory.getLog(ReflectorRunnable.class).isDebugEnabled();
sharedInformerFactory.setDebugItems(shouldDebugItems);
serviceInformer.setDebugItems(shouldDebugItems);
endpointsInformer.setDebugItems(shouldDebugItems);

Then, you can see streaming updates like this (in this case, I deleted a pod intentionally so it would be recreated):

2024-02-28T01:29:56.980Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154844
2024-02-28T01:30:06.582Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Next item class V1Endpoints {
    apiVersion: v1
    kind: Endpoints
    metadata: class V1ObjectMeta {
        annotations: {endpoints.kubernetes.io/last-change-trigger-time=2024-02-28T01:30:06Z}
        creationTimestamp: 2024-02-27T05:54:17Z
        deletionGracePeriodSeconds: null
        deletionTimestamp: null
        finalizers: null
        generateName: null
        generation: null
        labels: {app.kubernetes.io/instance=zipkin, app.kubernetes.io/managed-by=Helm, app.kubernetes.io/name=zipkin, app.kubernetes.io/version=3.0.6, helm.sh/chart=zipkin-0.2.0}
        managedFields: [class V1ManagedFieldsEntry {
            apiVersion: v1
            fieldsType: FieldsV1
            fieldsV1: {f:metadata={f:annotations={.={}, f:endpoints.kubernetes.io/last-change-trigger-time={}}, f:labels={.={}, f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:subsets={}}
            manager: k3s
            operation: Update
            subresource: null
            time: 2024-02-28T01:30:06Z
        }]
        name: zipkin
        namespace: default
        ownerReferences: null
        resourceVersion: 154855
        selfLink: null
        uid: 462ae678-e912-4491-bc93-3851580cae10
    }
    subsets: [class V1EndpointSubset {
        addresses: [class V1EndpointAddress {
            hostname: null
            ip: 10.42.0.225
            nodeName: colima
            targetRef: class V1ObjectReference {
                apiVersion: null
                fieldPath: null
                kind: Pod
                name: zipkin-57667879f-kc49k
                namespace: default
                resourceVersion: null
                uid: 8670d47d-7582-4d0c-93b3-cff41d8a5a50
            }
        }]
        notReadyAddresses: null
        ports: [class CoreV1EndpointPort {
            appProtocol: null
            name: http-query
            port: 9411
            protocol: TCP
        }]
    }]
}
2024-02-28T01:30:06.586Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154855

This also reduces the verbosity of log lines, by making the log tag match the simple name of the type instead of the type's toString(). As a side effect, this also matches the item toString values who also use the simple name.

For example, without this change, the log message prefix is extremely long and includes "class io.kubernetes.client.openapi.models." which isn't needed to disambiguate types watched.

2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Extract resourceVersion 154265 list meta
2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Initial items [class V1Service {

codefromthecrypt avatar Feb 27 '24 15:02 codefromthecrypt

ps happy to add integration tests if this feels helpful. cc @spencergibb @wind57 @ryanjbaxter as I feel with this log detail, it is a lot easier to reason with the result of API responses in spring-cloud-kubernetes-discoveryserver. Totally ack that in a realistic size cluster, this is way too much logging.

codefromthecrypt avatar Feb 27 '24 15:02 codefromthecrypt

Generally this looks ok to me. I do think that it is going to introduce a bunch of log spam in large clusters.

Perhaps we want the ability to turn it on/off for specific watches?

other than that lgtm.

brendandburns avatar Feb 27 '24 17:02 brendandburns

Thanks for the review @brendandburns. In analyzing how to propagate a "trace logging" flag, it seems the most sensible place to instrument is ReflectorRunnable where all the other logging is happening, and where the tagging (to manage the interleaving issue) is going on. I'll post a different version in the next commit.

codefromthecrypt avatar Feb 27 '24 22:02 codefromthecrypt

Changed impl and description, PTAL. If keen, I'll polish with tests.

codefromthecrypt avatar Feb 28 '24 01:02 codefromthecrypt

@yue9944882 hate to thrash, but another way is to add an ItemListener that gets these instead, and then someone can make a logging variant of it? (could also make a tracing variant of it, too)

codefromthecrypt avatar Feb 28 '24 03:02 codefromthecrypt

ps ignore my last comment. I violated the KISS principle. This change fits in with existing practice, is simple and doesn't prevent something more sophisticated later.

codefromthecrypt avatar Feb 28 '24 05:02 codefromthecrypt

on soft approval from both folks, I'll add tests!

codefromthecrypt avatar Feb 28 '24 05:02 codefromthecrypt

New changes are detected. LGTM label has been removed.

k8s-ci-robot avatar Mar 05 '24 17:03 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: codefromthecrypt Once this PR has been reviewed and has the lgtm label, please ask for approval from yue9944882. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Mar 05 '24 17:03 k8s-ci-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 03 '24 23:06 k8s-triage-robot