
panic: "send on closed channel" when viewing logs

Open sudo-bmitch opened this issue 3 years ago • 2 comments

Versions and Operating System

  • Kubernetes version:

    Output of kubectl version:

    $ kubectl version
    Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:41:01Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
    
  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

    $ tkn version
    Client version: 0.23.1
    Chains version: v0.10.0
    Pipeline version: v0.36.0
    Dashboard version: v0.26.0
    
  • Operating System:

    $ uname -a
    Linux ssf 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64 GNU/Linux
    

Expected Behavior

$ tkn pr logs --last -f
... log output ...
[clone-repo : clone] + printf '%s' 2a97d22ba282accad50821bca069210ea89de706                              
[clone-repo : clone] + printf '%s' https://github.com/IBM/tekton-tutorial-openshift                      
                                                                                                         
[build-and-push-image : write-url] ttl.sh/8b7f3b2ff1c54b270be97da02eb9aeb9/hello-frsca:1h                
                                                                                                         
[vulnerability-scan : grype-scanner] [0000]  INFO grype version: 0.35.0                                  
[vulnerability-scan : grype-scanner] [0000]  INFO New version of grype is available: 0.43.0   
... more log output ...

Actual Behavior

$ tkn pr logs --last -f
... log output ...
[clone-repo : clone] + '[' 0 '!=' 0 ]                                                                    
[clone-repo : clone] + printf '%s' 2a97d22ba282accad50821bca069210ea89de706                              
[clone-repo : clone] + printf '%s' https://github.com/IBM/tekton-tutorial-openshift                      
                                                    
[build-and-push-image : write-url] ttl.sh/8b7f3b2ff1c54b270be97da02eb9aeb9/hello-frsca:1h                
                                                                                                         
E0721 15:43:43.168531   67896 runtime.go:78] Observed a panic: "send on closed channel" (send on closed channel)
goroutine 224 [running]:                            
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x25fcaa0, 0x2ef2d30})                                    
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x7d
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x1617657})
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x25fcaa0, 0x2ef2d30})
        /usr/local/go/src/runtime/panic.go:1038 +0x215
github.com/tektoncd/cli/pkg/pods.(*Pod).watcher.func2({0x405ef8, 0xc0007fd620}, {0x29e3400, 0xc000ed2c00})                                                                 
        /workspace/src/github.com/tektoncd/cli/pkg/pods/pod.go:107 +0x36                                 
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(...)                                     
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/controller.go:238     
k8s.io/client-go/tools/cache.(*processorListener).run.func1()                                            
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775 +0x127
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7f7645dadf80)                                     
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x4694a7, {0x2f247e0, 0xc000fd3260}, 0x1, 0xc0007fda40)   
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0xc000b2cf88)                   
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)                                                             
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90       
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0005ad900)                                      
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()                                                                                                                                                           
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start                                              
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x88
panic: send on closed channel [recovered]
        panic: send on closed channel
goroutine 224 [running]:                            
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x1617657})                                  
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0xd8
panic({0x25fcaa0, 0x2ef2d30})                       
        /usr/local/go/src/runtime/panic.go:1038 +0x215                                                   
github.com/tektoncd/cli/pkg/pods.(*Pod).watcher.func2({0x405ef8, 0xc0007fd620}, {0x29e3400, 0xc000ed2c00})
        /workspace/src/github.com/tektoncd/cli/pkg/pods/pod.go:107 +0x36                                 
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(...)                                     
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/controller.go:238     
k8s.io/client-go/tools/cache.(*processorListener).run.func1()                                            
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775 +0x127
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7f7645dadf80)                                     
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x4694a7, {0x2f247e0, 0xc000fd3260}, 0x1, 0xc0007fda40)   
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0xc000b2cf88)                   
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)        
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90       
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0005ad900)                                      
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()                                                 
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start                                              
        /workspace/src/github.com/tektoncd/cli/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x88

Steps to Reproduce the Problem

  1. tkn pr logs --last -f
  2. Repeat step 1 until you hit the race condition.

Additional Info

This looks like a duplicate of #1080 and #1132, but they've aged out. Note that in the logs, the output was just reaching the end of one container's logs as the screen caught up after I ran the command, so perhaps there's a race condition either in switching between container logs, or in switching from replaying existing logs to following new output right as a container exits.
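For what it's worth, the panic itself is easy to reproduce in isolation. This is a minimal self-contained sketch (not the tkn code path) of what happens when a goroutine sends on a channel that another goroutine has already closed:

```go
package main

import "fmt"

// sendAfterClose illustrates the failure mode behind the trace above:
// any send on an already-closed channel panics with
// "send on closed channel". Here we recover the panic and report it.
func sendAfterClose() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan struct{})
	close(ch)        // another goroutine closing eventC plays this role
	ch <- struct{}{} // panics: send on closed channel
	return false
}

func main() {
	fmt.Println(sendAfterClose()) // prints "true": the send panicked
}
```

In the real code the close happens in a deferred call in one goroutine while the informer callback sends from another, which is why the window is timing-dependent.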

sudo-bmitch avatar Jul 21 '22 19:07 sudo-bmitch

This is pointing to: https://github.com/tektoncd/cli/blob/d8f44287ceafb1c84c20b94ad00555dbe7d22ede/pkg/pods/pod.go#L107

I think we need to make sure the watcher is stopped before exiting this function since the defer will close the eventC channel: https://github.com/tektoncd/cli/blob/d8f44287ceafb1c84c20b94ad00555dbe7d22ede/pkg/pods/pod.go#L79-L95

sudo-bmitch avatar Jul 21 '22 20:07 sudo-bmitch

/assign

pradeepitm12 avatar Jul 26 '22 12:07 pradeepitm12

@sudo-bmitch Thanks for raising the issue and for the suggested fix. The Watcher is not a channel that we can close, but we can change the sequence

defer close(eventC) 
defer close(stopC)

to

defer close(stopC) 
defer close(eventC) 

The sharedInformer is stopped once it receives on stopC. With this order, eventC is closed first, and then closing stopC stops the sharedInformer at the end of the func. Your thoughts on this would be helpful.
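As a quick aside on the ordering (a minimal sketch with invented names, not the pod.go code): deferred calls run last-in-first-out, so with `defer close(stopC)` declared before `defer close(eventC)`, the eventC close runs first and the stopC close runs last.

```go
package main

import "fmt"

// demoDeferOrder records the order in which deferred calls run,
// mirroring the proposed sequence
//     defer close(stopC)
//     defer close(eventC)
// Defers execute last-in-first-out, so the eventC close (declared
// last) runs first and the stopC close (declared first) runs last.
func demoDeferOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "stopC closed") }()  // runs second
		defer func() { order = append(order, "eventC closed") }() // runs first
	}()
	return order
}

func main() {
	fmt.Println(demoDeferOrder()) // [eventC closed stopC closed]
}
```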

pradeepitm12 avatar Aug 22 '22 13:08 pradeepitm12

I have my doubts that the sharedInformer would stop immediately. Typically with channels like this, the receiving code only checks the channel between requests, so there will be a delay between the close of stopC and the last possible write to eventC. If you close both, the order of the closes only changes the size of the race window. If the component writing to the channel isn't the one closing it, then you'd want a mutex and a flag indicating the channel is still open, checked before attempting to write to the channel.
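The mutex-and-flag pattern described above could look roughly like this (a sketch with hypothetical names, not the actual pod.go code):

```go
package main

import (
	"fmt"
	"sync"
)

// guardedChan sketches the mutex-and-flag pattern: the writer checks
// a "closed" flag under a lock before sending, and Close sets the
// flag under the same lock, so a send can never race with the close.
type guardedChan struct {
	mu     sync.Mutex
	closed bool
	ch     chan string
}

func newGuardedChan(buf int) *guardedChan {
	return &guardedChan{ch: make(chan string, buf)}
}

// Send returns false instead of panicking if the channel is closed.
func (g *guardedChan) Send(msg string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed {
		return false
	}
	g.ch <- msg
	return true
}

// Close is safe to call more than once.
func (g *guardedChan) Close() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.closed {
		g.closed = true
		close(g.ch)
	}
}

func main() {
	g := newGuardedChan(1)
	fmt.Println(g.Send("event")) // true
	g.Close()
	fmt.Println(g.Send("late event")) // false: dropped instead of panicking
}
```

One caveat with this sketch: Send holds the lock while sending, so the channel should be buffered or have an active reader; a blocked send would otherwise hold the mutex and stall Close.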

sudo-bmitch avatar Aug 22 '22 18:08 sudo-bmitch