
Runtime error when recording events for deleted resources

zongzw opened this issue 2 years ago

Hi,

This issue happens consistently for me with the YAML and demo code below. To demonstrate the error simply, I summarized it in https://github.com/zongzw/client-go-helloworld/

Versions:

k8s.io/api v0.24.1
k8s.io/client-go v0.24.1

Code For Reproducing:

In this repo, I was:

  • Using client-go to watch all ConfigMap changes in my k8s environment (a fuller wiring sketch follows this list):
    sharedInformerFactory := informers.NewSharedInformerFactoryWithOptions(
        clientset,
        30*time.Second,
    )
    cfgInformer = sharedInformerFactory.Core().V1().ConfigMaps().Informer()
    cfgInformer.AddEventHandler(&cache.ResourceEventHandlerFuncs{
        // Add/Update/Delete handlers enqueue RecordObj items; see the next bullet.
    })

  • When new changes come, recording an event whose related object is that ConfigMap:
    queue <- RecordObj{
        object:  abc, // abc is the ConfigMap object delivered by the informer callback
        reason:  "Deleted",
        message: "object deletion has been notified by informer",
    }

    ... later

    func recordDaemon(stopCh <-chan struct{}) {
        for {
            select {
            case <-stopCh:
                return
            case r := <-queue:
                recorder.Event(r.object, v1.EventTypeNormal, r.reason, r.message)
            }
        }
    }
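
For context, here is a minimal sketch of how such a recorder and informer are typically wired together. The broadcaster setup, the variable names, and the DeleteFunc body are my paraphrase/assumptions (consistent with the StartRecordingToSink frames in the stack trace further down), not code copied verbatim from the repo:

    package main

    import (
        "time"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/runtime"
        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/kubernetes/scheme"
        typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
        "k8s.io/client-go/tools/cache"
        "k8s.io/client-go/tools/record"
    )

    // RecordObj carries everything recordDaemon needs to emit one Event.
    type RecordObj struct {
        object  runtime.Object
        reason  string
        message string
    }

    var queue = make(chan RecordObj, 100)

    // setup builds the event recorder and the ConfigMap informer (names are illustrative).
    func setup(clientset kubernetes.Interface) (record.EventRecorder, cache.SharedIndexInformer) {
        broadcaster := record.NewBroadcaster()
        broadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{
            Interface: clientset.CoreV1().Events(""),
        })
        recorder := broadcaster.NewRecorder(scheme.Scheme, corev1.EventSource{Component: "client-go-helloworld"})

        factory := informers.NewSharedInformerFactoryWithOptions(clientset, 30*time.Second)
        cfgInformer := factory.Core().V1().ConfigMaps().Informer()
        cfgInformer.AddEventHandler(&cache.ResourceEventHandlerFuncs{
            DeleteFunc: func(obj interface{}) {
                cm, ok := obj.(*corev1.ConfigMap)
                if !ok {
                    return // e.g. a cache.DeletedFinalStateUnknown tombstone
                }
                queue <- RecordObj{object: cm, reason: "Deleted", message: "object deletion has been notified by informer"}
            },
        })
        // factory.Start(stopCh) and cache sync are omitted for brevity.
        return recorder, cfgInformer
    }

    func main() {} // entry point elided; recordDaemon above consumes the queue

The recorder returned here is what recordDaemon calls Event on; per the stack trace below, the write to the API server and the failing log line run inside the goroutine started by StartRecordingToSink.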
    

Reproducing Steps:

  1. I defined my resources and their namespace in the test.yaml file, so that I can kubectl apply or kubectl delete them together.

  2. It runs well when I kubectl apply them, but when I kubectl delete them, it reports a runtime error and the program panics.

Some Initial Findings:

The panic happens in the fmt package when it tries to format/print a nil *k8s.io/apimachinery/pkg/apis/meta/v1.Time.

The call stack is

k8s.io/apimachinery/pkg/apis/meta/v1.(*Time).GoString (:1)
fmt.(*pp).handleMethods (/usr/local/go/src/fmt/print.go:603)
fmt.(*pp).printValue (/usr/local/go/src/fmt/print.go:723)
fmt.(*pp).printValue (/usr/local/go/src/fmt/print.go:806)
fmt.(*pp).printValue (/usr/local/go/src/fmt/print.go:806)
fmt.(*pp).printValue (/usr/local/go/src/fmt/print.go:876)
fmt.(*pp).printArg (/usr/local/go/src/fmt/print.go:712)
fmt.(*pp).doPrintf (/usr/local/go/src/fmt/print.go:1026)
fmt.Fprintf (/usr/local/go/src/fmt/print.go:204)
k8s.io/klog/v2.(*loggingT).printfDepth (/Users/zong/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:626)
k8s.io/klog/v2.(*loggingT).printf (/Users/zong/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:612)
k8s.io/klog/v2.Errorf (/Users/zong/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1458)
k8s.io/client-go/tools/record.recordEvent (/Users/zong/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:267)
k8s.io/client-go/tools/record.recordToSink (/Users/zong/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:216)
k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartRecordingToSink.func1 (/Users/zong/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:194)
k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1 (/Users/zong/go/pkg/mod/k8s.io/[email protected]/tools/record/event.go:311)
runtime.goexit (/usr/local/go/src/runtime/asm_amd64.s:1581)
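
The (*Time).GoString frame at the top (reported at ":1") looks like an autogenerated pointer-method wrapper: metav1.Time embeds time.Time, time.Time gained a GoString method in Go 1.17, so *metav1.Time satisfies fmt.GoStringer through a wrapper that dereferences the pointer, and calling that through a nil pointer panics. A minimal illustration (my own snippet, not code from client-go; which field is actually nil in my case is an assumption):

    package main

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    func main() {
        // A nil *metav1.Time, e.g. an unset pointer field such as ObjectMeta.DeletionTimestamp.
        var t *metav1.Time
        // GoString is promoted from the embedded time.Time; the autogenerated pointer
        // wrapper dereferences t, so this call panics with a nil pointer dereference.
        _ = t.GoString()
    }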

Because the namespace was deleted, some event recordings fail with the error: events \"cm-canary-1.16fee48bbe2efa10\" is forbidden: unable to create new content in namespace namespace-test-del because it is being terminated.

The raw error is:

k8s.io/apimachinery/pkg/apis/meta/v1.Status {TypeMeta: k8s.io/apimachinery/pkg/apis/meta/v1.TypeMeta {Kind: "", APIVersion: ""}, ListMeta: k8s.io/apimachinery/pkg/apis/meta/v1.ListMeta {SelfLink: "", ResourceVersion: "", Continue: "", RemainingItemCount: *int64 nil}, Status: "Failure", Message: "events \"cm-canary-1.16fee48bbe2efa10\" is forbidden: unable to create new content in namespace namespace-test-del because it is being terminated", Reason: "Forbidden", Details: *k8s.io/apimachinery/pkg/apis/meta/v1.StatusDetails {Name: "cm-canary-1.16fee48bbe2efa10", Group: "", Kind: "events", UID: "", Causes: []k8s.io/apimachinery/pkg/apis/meta/v1.StatusCause len: 1, cap: 4, [(*"k8s.io/apimachinery/pkg/apis/meta/v1.StatusCause")(0xc0002003c0)], RetryAfterSeconds: 0}, Code: 403}
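
In other words, the failure is a standard apimachinery StatusError with reason Forbidden; if a caller wanted to recognize it before logging or retrying, something like the sketch below would do (the helper name and the NamespaceTerminating cause check are my assumptions, not anything from client-go or my repo):

    package main

    import (
        corev1 "k8s.io/api/core/v1"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
    )

    // isForbiddenByTerminatingNamespace reports whether err is the kind of 403 shown above,
    // i.e. event creation rejected because the target namespace is being deleted.
    func isForbiddenByTerminatingNamespace(err error) bool {
        return apierrors.IsForbidden(err) &&
            apierrors.HasStatusCause(err, corev1.NamespaceTerminatingCause)
    }

    func main() {} // standalone sketch; nothing to run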

My Question:

So here I wonder: is this a bug in client-go, a misuse on my side, or a bug in fmt?

zongzw Jul 05 '22 10:07

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot Oct 03 '22 12:10

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot Nov 02 '22 12:11

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot Dec 02 '22 13:12

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot Dec 02 '22 13:12