serverccl: package times out during shutdown causing flakes

Open rafiss opened this issue 3 years ago • 1 comments

Describe the problem

The serverccl package has been timing out.

One theory:

It seems the server fails to shut down: we are waiting for quiescence, but the contexts are apparently not being canceled correctly, so we end up in an infinite retry inside kv/kvclient/rangecache.(*RangeCache).tryLookup.

Bisecting, I landed on 262a70d506e0b1f14ac1ba4ab831885c26bcd901 as the first bad commit, but I don't see why it would cause this.

To Reproduce

The TestNoInflightTracesVirtualTableOnTenant test reproduces it.

./dev test pkg/ccl/serverccl --stress --filter=TestNoInflightTracesVirtualTableOnTenant --timeout=2m --test-args='-test.timeout 20s'

Jira issue: CRDB-18494

rafiss, Aug 11 '22 03:08

The below stack trace is telling.

* goroutine 53508 [select]:
* github.com/cockroachdb/cockroach/pkg/util/retry.(*Retry).Next(0xc001a4fe10)
* 	github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:127 +0x13e
* github.com/cockroachdb/cockroach/pkg/sql/catalog/schematelemetry/schematelemetrycontroller.updateSchedule({0x5eb3ed8, 0xc010228d20}, 0xc6567c?, {0x5efb720, 0xc00d7db6e0}, 0xc010634000)
* 	github.com/cockroachdb/cockroach/pkg/sql/catalog/schematelemetry/schematelemetrycontroller/pkg/sql/catalog/schematelemetry/schematelemetrycontroller/controller.go:149 +0x266

ajwerner, Aug 11 '22 03:08

https://github.com/cockroachdb/cockroach/pull/85945 seems to have fixed it

rafiss, Aug 11 '22 18:08