Managed Beats panic on termination due to close of closed channel

Open adriansr opened this issue 2 years ago • 2 comments

For confirmed bugs, please report:

  • Version: 8.3.0+
  • Operating System: Linux
  • Discuss Forum URL:
  • Steps to Reproduce:

When Beats are managed by Elastic Agent, they panic on termination with "panic: close of closed channel", which suggests the shutdown path is triggered twice.

fleet-server_1 | {"log.level":"error","@timestamp":"2022-08-02T08:28:58.942Z","log.origin":{"file.name":"process/stdlogger.go","file.line":54},"message":"filebeat_monitoring stderr: "panic: close of closed channel\n\ngoroutine 55 [running]:\ngithub.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Stop(0xc0005fbc50)\n\t/go/src/github.com/elastic/beats/filebeat/beater/filebeat.go:428 +0x3f\ngithub.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch.func5()\n\t/go/src/github.com/elastic/beats/libbeat/cmd/instance/beat.go:461 +0x55\nsync.(*Once).doSlow(0x0?, 0x0?)\n\t/usr/local/go/src/sync/once.go:68 +0xc2\n"","agent.console.name":"filebeat_monitoring","agent.console.type":"stderr","ecs.version":"1.6.0"} fleet-server_1 | {"log.level":"error","@timestamp":"2022-08-02T08:28:58.943Z","log.origin":{"file.name":"process/stdlogger.go","file.line":54},"message":"filebeat_monitoring stderr: "sync.(*Once).Do(...)\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/elastic/elastic-agent-libs/service.HandleSignals.func1()\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:60 +0x19c\ncreated by github.com/elastic/elastic-agent-libs/service.HandleSignals\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:49 +0x18f\n"","agent.console.name":"filebeat_monitoring","agent.console.type":"stderr","ecs.version":"1.6.0"}

and

metricbeat_monitoring stderr: "panic: close of closed channel\n\ngoroutine 171 [running]:\n"
metricbeat_monitoring stderr: "github.com/elastic/beats/v7/metricbeat/beater.(*Metricbeat).Stop(0x40003fe780?)\n\t/go/src/github.com/elastic/beats/metricbeat/beater/metricbeat.go:276 +0x24\ngithub.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch.func5()\n\t/go/src/github.com/elastic/beats/libbeat/cmd/instance/beat.go:461 +0x54\nsync.(*Once).doSlow(0x400071caa0?, 0x400012a3ff?)\n\t/usr/local/go/src/sync/once.go:68 +0x10c\nsync.(*Once).Do(...)\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/elastic/elastic-agent-libs/service.HandleSignals.func1()\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:60 +0x168\ncreated by github.com/elastic/elastic-agent-libs/service.HandleSignals\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:49 +0x168\n"
filebeat_monitoring stderr: "panic: close of closed channel\n\ngoroutine 54 [running]:\ngithub.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Stop(0x400036bd10)\n\t/go/src/github.com/elastic/beats/filebeat/beater/filebeat.go"
filebeat_monitoring stderr: ":428 +0x48\ngithub.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch.func5()\n\t/go/src/github.com/elastic/beats/libbeat/cmd/instance/beat.go:461 +0x54\nsync.(*Once).doSlow(0x4000122788?, 0x6163206f742064ff?)\n\t/usr/local/go/src/sync/once.go:68 +0x10c\nsync.(*Once).Do(...)\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/elastic/elastic-agent-libs/service.HandleSignals.func1()\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:60 +0x168\ncreated by github.com/elastic/elastic-agent-libs/service.HandleSignals\n\t/go/pkg/mod/github.com/elastic/[email protected]/service/service.go:49 +0x168\n"

I've reproduced this with all 8.3.x versions. 8.2.2 doesn't have this issue, so this may be related to the refactor of some termination logic into the elastic-agent-libs repo.
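
Reading the stack traces, the failure appears to come down to two independent shutdown paths (the signal handler installed by elastic-agent-libs' service.HandleSignals and the beat's own teardown) both reaching Stop(), which closes a channel with no guard. The following is a minimal, simplified sketch of that pattern with made-up names; it is not the actual Beats code:

```go
// Simplified sketch of the suspected failure mode (hypothetical names,
// not the actual Beats code): two shutdown paths both call Stop(),
// which closes the same channel without a guard.
package main

import (
	"fmt"
	"sync"
)

type beater struct {
	done chan struct{} // closed to tell the run loop to exit
}

// Stop closes done unconditionally, mirroring the panic site in the
// traces above (filebeat.go:428 / metricbeat.go:276).
func (b *beater) Stop() {
	close(b.done) // second caller hits: panic: close of closed channel
}

func main() {
	b := &beater{done: make(chan struct{})}

	// The signal handler wraps its stop callback in a sync.Once, but that
	// Once only dedupes the handler itself, not other callers of Stop().
	var once sync.Once
	onSignal := func() { once.Do(b.Stop) }

	// Path 1: the normal launch/teardown code stops the beat directly.
	b.Stop()

	// Path 2: a termination signal arrives and the handler stops it again.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r) // "close of closed channel"
		}
	}()
	onSignal()
}
```

In the traces above the panicking goroutine is the signal handler, which is consistent with the channel already having been closed by another shutdown path.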

adriansr avatar Aug 04 '22 14:08 adriansr

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

elasticmachine avatar Aug 04 '22 14:08 elasticmachine

I talked to @AndersonQ about this earlier. As far as I understand, this defect might have been there before; we just didn't capture stderr output in the logs until now.

aleksmaus avatar Aug 04 '22 15:08 aleksmaus

Is this bug fixed or still there? I'm running Elastic Agent 8.4.x and seeing the same error.

viszsec avatar Sep 09 '22 09:09 viszsec

It's still there; however, so far we have no evidence of any negative impact caused by it. The issue has existed for a long time, but it only surfaced when the Elastic Agent started to collect the stderr and stdout of the Beats, see https://github.com/elastic/elastic-agent/pull/455.

Apart from the error logs, have you experienced any issue you believe to be caused by this panic, @viszsec?

AndersonQ avatar Sep 09 '22 12:09 AndersonQ

No negative impacts found thus far. It's just that the errors are written to the logs as-is.

viszsec avatar Dec 13 '22 06:12 viszsec

Fixed in https://github.com/elastic/beats/pull/33971 for the 8.6 release
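
For reference, the usual remedy for this class of bug is to make Stop() idempotent, for example by wrapping the close itself in a sync.Once. The sketch below shows that general pattern with hypothetical names; it is not taken from the linked PR:

```go
// General pattern for an idempotent Stop(): guard the close itself with a
// sync.Once so any number of shutdown paths can call it safely.
// Hypothetical names; a sketch of the pattern, not the code from the PR.
package main

import (
	"fmt"
	"sync"
)

type beater struct {
	done     chan struct{}
	stopOnce sync.Once
}

// Stop may now be called from the signal handler, the agent manager, and
// the launch teardown without risking a double close.
func (b *beater) Stop() {
	b.stopOnce.Do(func() { close(b.done) })
}

func main() {
	b := &beater{done: make(chan struct{})}
	b.Stop()
	b.Stop() // no panic: the close only runs once
	fmt.Println("stopped cleanly")
}
```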

cmacknz avatar Dec 21 '22 14:12 cmacknz