retina-agent panics when running locally in a Kind cluster
make helm-install-advanced-local-context
Logs:
ts=2024-03-21T20:58:50.234Z level=panic caller=controllermanager/controllermanager.go:118 msg="Error running controller manager" goversion=go1.21.8 os=linux arch=amd64 numcores=16 hostname=backstage-worker podname=retina-agent-88dzr version=v0.0.1 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser error="failed to start plugin manager, plugin exited: failed to start plugin packetparser: interface eth0 of type device not found" errorVerbose="interface eth0 of type device not found\nfailed to start plugin packetparser\ngithub.com/microsoft/retina/pkg/managers/pluginmanager.(*PluginManager).Start.func1\n\t/go/src/github.com/microsoft/retina/pkg/managers/pluginmanager/pluginmanager.go:174\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650\nfailed to start plugin manager, plugin exited\ngithub.com/microsoft/retina/pkg/managers/pluginmanager.(*PluginManager).Start\n\t/go/src/github.com/microsoft/retina/pkg/managers/pluginmanager/pluginmanager.go:186\ngithub.com/microsoft/retina/pkg/managers/controllermanager.(*Controller).Start.func1\n\t/go/src/github.com/microsoft/retina/pkg/managers/controllermanager/controllermanager.go:108\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"
panic: Error running controller manager [recovered]
panic: Error running controller manager
goroutine 138 [running]:
github.com/microsoft/retina/pkg/telemetry.TrackPanic()
/go/src/github.com/microsoft/retina/pkg/telemetry/telemetry.go:112 +0x209
panic({0x242fc60?, 0xc003192120?})
/usr/local/go/src/runtime/panic.go:914 +0x21f
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x1?, 0x0?, {0x0?, 0x0?, 0xc00318e020?})
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:196 +0x54
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0031941a0, {0xc003190380, 0x1, 0x1})
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:262 +0x3ec
go.uber.org/zap.(*Logger).Panic(0xc000493640?, {0x2b48afa?, 0x0?}, {0xc003190380, 0x1, 0x1})
/go/pkg/mod/go.uber.org/[email protected]/logger.go:284 +0x51
github.com/microsoft/retina/pkg/managers/controllermanager.(*Controller).Start(0xc000d01cc0, {0x2f057d0?, 0xc000836320?})
/go/src/github.com/microsoft/retina/pkg/managers/controllermanager/controllermanager.go:118 +0x28c
created by main.main in goroutine 1
/go/src/github.com/microsoft/retina/controller/main.go:286 +0x2825
Hi @shashankram ,
Adding some details about the error: the Packetparser plugin expects the node to have an eth0 device and tries to attach tc programs to it.
Two observations:
- Packetparser shouldn't panic if eth0 is not present, it should warn and move on. We should definitely fix this in code. Will track the fix using this issue.
- Even with that fix, Retina may not work as expected on your Kind setup. The plugins are dependent on the underlying host kernel. For example, if the Kind cluster is running on Docker installed on a Windows machine, Retina won't run successfully.
For anyone working on this fix, the error handling needs to go here: https://github.com/microsoft/retina/blob/0b8a44caf1fa073cca19649e493b0a66d5416822/pkg/plugin/packetparser/packetparser_linux.go#L215C3-L215C13
@aman952036
I have the same issue trying to install v0.0.4 on my k8s cluster.
I tried installing Retina using this reference on k8s version v1.28.6, but retina-agent always ends up in CrashLoopBackOff.
Cluster
Step install
VERSION=$( curl -sL https://api.github.com/repos/microsoft/retina/releases/latest | jq -r .name)
helm upgrade --install retina oci://ghcr.io/microsoft/retina/charts/retina \
--version $VERSION \
--namespace kube-system \
--set image.tag=$VERSION \
--set operator.tag=$VERSION \
--set image.pullPolicy=Always \
--set logLevel=info \
--set os.windows=false \
--set operator.enabled=true \
--set operator.enableRetinaEndpoint=true \
--set enabledPlugin_linux="\[dropreason\,packetforward\,linuxutil\,dns\,packetparser\]" \
--set enablePodLevel=true \
--set enableAnnotations=true
After install
Evidence Logs
Checking logs using --previous
ts=2024-04-01T12:54:09.716Z level=error caller=pluginmanager/pluginmanager.go:185 msg="plugin manager exited with error" goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser error="failed to start plugin packetparser: interface eth0 of type device not found" errorVerbose="interface eth0 of type device not found\nfailed to start plugin packetparser\ngithub.com/microsoft/retina/pkg/managers/pluginmanager.(*PluginManager).Start.func1\n\t/go/src/github.com/microsoft/retina/pkg/managers/pluginmanager/pluginmanager.go:174\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"
ts=2024-04-01T12:54:09.716Z level=info caller=server/server.go:79 msg="gracefully shutting down HTTP server..." goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser
ts=2024-04-01T12:54:09.716Z level=info caller=watchermanager/watchermanager.go:71 msg="watcher stopping..." goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser watcher_type=*endpoint.EndpointWatcher
ts=2024-04-01T12:54:09.716Z level=info caller=server/server.go:71 msg="HTTP server stopped with err: http: Server closed" goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser
ts=2024-04-01T12:54:09.716Z level=info caller=watchermanager/watchermanager.go:71 msg="watcher stopping..." goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser watcher_type=*apiserver.ApiServerWatcher
ts=2024-04-01T12:54:09.716Z level=panic caller=controllermanager/controllermanager.go:119 msg="Error running controller manager" goversion=go1.21.8 os=linux arch=amd64 numcores=4 hostname=nb-k8s-controlplane-1 podname=retina-agent-25kqb version=v0.0.4 apiserver=https://10.96.0.1:443 plugins=dropreason,packetforward,linuxutil,dns,packetparser error="failed to start plugin manager, plugin exited: failed to start plugin packetparser: interface eth0 of type device not found" errorVerbose="interface eth0 of type device not found\nfailed to start plugin packetparser\ngithub.com/microsoft/retina/pkg/managers/pluginmanager.(*PluginManager).Start.func1\n\t/go/src/github.com/microsoft/retina/pkg/managers/pluginmanager/pluginmanager.go:174\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650\nfailed to start plugin manager, plugin exited\ngithub.com/microsoft/retina/pkg/managers/pluginmanager.(*PluginManager).Start\n\t/go/src/github.com/microsoft/retina/pkg/managers/pluginmanager/pluginmanager.go:186\ngithub.com/microsoft/retina/pkg/managers/controllermanager.(*Controller).Start.func1\n\t/go/src/github.com/microsoft/retina/pkg/managers/controllermanager/controllermanager.go:109\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"
panic: Error running controller manager
goroutine 46 [running]:
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x1?, 0x0?, {0x0?, 0x0?, 0xc003baa120?})
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:196 +0x54
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc003bc00d0, {0xc003bb46c0, 0x1, 0x1})
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:262 +0x3ec
go.uber.org/zap.(*Logger).Panic(0xc000c4d400?, {0x2b52d30?, 0x0?}, {0xc003bb46c0, 0x1, 0x1})
/go/pkg/mod/go.uber.org/[email protected]/logger.go:284 +0x51
github.com/microsoft/retina/pkg/managers/controllermanager.(*Controller).Start(0xc0008dcfa0, {0x2f10a90?, 0xc000c4a870?})
/go/src/github.com/microsoft/retina/pkg/managers/controllermanager/controllermanager.go:119 +0x28c
created by main.main in goroutine 1
/go/src/github.com/microsoft/retina/controller/main.go:290 +0x28d0
same issue