dd-trace-go icon indicating copy to clipboard operation
dd-trace-go copied to clipboard

[BUG] unexpected goroutines remain in service after upgrading from 1.69.1 to 1.70.1

Open bertold opened this issue 1 year ago • 4 comments

Version of dd-trace-go 1.70.1

Describe what happened: After updating our services depending on dd-trace-go that rely on APM features as, we detected leaks in go routines. Likely caused by te semlog update

goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 7 in state sync.Cond.Wait, with sync.runtime_notifyListWait on top of the stack:
sync.runtime_notifyListWait(0xc0001e1490, 0x0)
	/usr/local/go/src/runtime/sema.go:587 +0x159
sync.(*Cond).Wait(0xc0001e1480)
	/usr/local/go/src/sync/cond.go:71 +0x75
github.com/cihub/seelog.(*asyncLoopLogger).processItem(0xc0000fb4d0)
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:50 +0x159
github.com/cihub/seelog.(*asyncLoopLogger).processQueue(0xc0000fb4d0)
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:63 +0x37
created by github.com/cihub/seelog.NewAsyncLoopLogger in goroutine 1
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:40 +0x11d
 Goroutine 8 in state sync.Cond.Wait, with sync.runtime_notifyListWait on top of the stack:
sync.runtime_notifyListWait(0xc0001e1610, 0x0)
	/usr/local/go/src/runtime/sema.go:587 +0x159
sync.(*Cond).Wait(0xc0001e1600)
	/usr/local/go/src/sync/cond.go:71 +0x75
github.com/cihub/seelog.(*asyncLoopLogger).processItem(0xc0000fb5f0)
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:50 +0x159
github.com/cihub/seelog.(*asyncLoopLogger).processQueue(0xc0000fb5f0)
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:63 +0x37
created by github.com/cihub/seelog.NewAsyncLoopLogger in goroutine 1
	/builds/verosint/back-end/signalapi/.go/pkg/mod/github.com/cihub/[email protected]/behavior_asynclooplogger.go:40 +0x11d
]

Describe what you expected: No goroutine leaks.

Steps to reproduce the issue: Running unit tests. Not easy to provide sample here.

Additional environment details (Version of Go, Operating System, etc.): Go version: 1.23.3 OS: Linux/amd64

bertold avatar Nov 25 '24 15:11 bertold

Thanks for reporting this.

This is a known issue in the seelog library which was introduced as an indirect dependency by https://github.com/DataDog/dd-trace-go/pull/2817:

$ pwd
/go/src/github.com/DataDog/dd-trace-go
$ go mod why github.com/cihub/seelog
# github.com/cihub/seelog
gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer
github.com/DataDog/datadog-agent/pkg/trace/stats
github.com/DataDog/datadog-agent/pkg/trace/config
github.com/DataDog/datadog-agent/pkg/util/log
github.com/cihub/seelog

Since you're using goleak, would you be okay with using the same workaround as dd-trace-go is currently using?

func TestMain(m *testing.M) {
	// TODO: seelog (indirect dependency) has a known goroutine leak where it leaks a single goroutine on init (https://github.com/cihub/seelog/issues/182)
	goleak.VerifyTestMain(m, goleak.IgnoreAnyFunction("github.com/cihub/seelog.(*asyncLoopLogger).processQueue"))
}

Ideally we would fix this on our end, but it's going to require a bigger effort. cc @ajgajg1134

felixge avatar Nov 26 '24 10:11 felixge

Am I missing a context here? dd-trace-go is owned by DataDog org. It seems that datadog-agent is also under the same org. Should the datadog-agent be fixed? If that is not possible, maybe dd-trace-go should hold off updating this component until the leak is fixed?

bertold avatar Nov 26 '24 17:11 bertold

The team is actively working on this issue, and while we cannot remove the new dependency on datadog-agent, we will fork github.com/cihub/seelog repository and fix the leak there.

ichinaski avatar Nov 27 '24 14:11 ichinaski

Any updates on this? We're still goroutine leaks in the latest release.

pje avatar Mar 03 '25 15:03 pje

Hi @pje and everybody: @felixge implemented a solution to avoid the leaking goroutine while the team maintaining datadog-agent can complete the migration from seelog.

This will be released as part of our v2.2.2 release.

darccio avatar Jul 30 '25 12:07 darccio

https://github.com/DataDog/dd-trace-go/releases/tag/v2.2.2 has been released, closing this

felixge avatar Sep 28 '25 18:09 felixge