cilium-cli
                                
Use ephemeral containers during sysdump if Cilium is stuck in crashloop
Currently, bugtool output for the Cilium agent is missing from the sysdump when the agent is in a crashloop. A lot of helpful information (e.g., open sockets, iptables rules) could also be collected from nodes where the Cilium agent fails to start. Would it be possible to run a job on the node with a bugtool/bpftool image to collect the current node state when the Cilium pod fails to start?
Yes, this is possible, either a) with ephemeral containers, if the cluster supports them: https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/ or b) by running a Deployment on the node(s) selected by a specific label, or even on all nodes, that runs the bugtool on those nodes.
^ Good idea. I've updated the issue to reflect this feature request and am transferring it to the CLI repo, as that's where sysdump lives.