
kvserver: add observability into lockedMuxStream

Open wenyihu6 opened this issue 7 months ago • 1 comments

Is your feature request related to a problem? Please describe.

Rangefeeds under the same node.MuxRangeFeed share the same lockedMuxStream https://github.com/cockroachdb/cockroach/blob/692b93c1221acd004f4ea9f25531868e585d899e/pkg/server/node.go#L2179. We have seen escalations in which a goroutine waits a substantial amount of time to acquire the mutex. This can happen when the client side is very slow at admitting events. It is hard to debug without looking at a goroutine dump. We should add more observability into this:

goroutine 1535831952 [sync.Mutex.Lock, 17 minutes]:
sync.runtime_SemacquireMutex(0x479498?, 0x78?, 0xc0b97567c0?)
	GOROOT/src/runtime/sema.go:77 +0x25
sync.(*Mutex).lockSlow(0xc0ff8cd570)
	GOROOT/src/sync/mutex.go:171 +0x15d
sync.(*Mutex).Lock(...)
	GOROOT/src/sync/mutex.go:90
github.com/cockroachdb/cockroach/pkg/server.(*lockedMuxStream).Send(0xc08e0fb808?, 0xc0b97567c0)
	pkg/server/node.go:2041 +0x5c

Related escalation: https://github.com/cockroachlabs/support/issues/3287

Jira issue: CRDB-50534

wenyihu6 avatar May 09 '25 17:05 wenyihu6

O-support for https://github.com/cockroachlabs/support/issues/3342#issuecomment-2983907045

tbg avatar Jun 18 '25 14:06 tbg

This got done in https://github.com/cockroachdb/cockroach/pull/147440, unsure why escalate disagrees.

tbg avatar Jun 30 '25 15:06 tbg