Tobias Grieger
Had to add some preliminary cleanup commits to address data races - the correct use of tenant metrics is notoriously tricky because the metrics must not be used once the...
TFTR! bors r+
O-support for https://github.com/cockroachlabs/support/issues/3342#issuecomment-2983907045
This got done in https://github.com/cockroachdb/cockroach/pull/147440, unsure why escalate disagrees.
Are we still collecting? Here's one from https://github.com/cockroachdb/cockroach/issues/129980 Details ``` W240902 08:54:42.299319 13191 kv/kvserver/intentresolver/intent_resolver.go:1038 ⋮ [-] 214 test-only warning: if you see this, please report to https://github.com/cockroachdb/cockroach/issues/112680. empty admission header...
This is a duplicate of https://github.com/cockroachdb/cockroach/issues/141912#issuecomment-2688996459. Applying the corresponding labels and marking as duplicate.
In the failing run, we get this: ``` I250618 08:27:51.238514 87 kv/kvserver_test/flow_control_integration_test.go:3780 [-] 452 -- (Issuing 1x1MiB, 4x replicated write (w/ one non-voter) that's not admitted. I250618 08:27:51.238808 2709 kv/kvserver/kvflowcontrol/rac2/token_counter.go:682...
I added this custom logging ``` I250618 09:09:08.791748 2733 kv/kvserver/kvflowcontrol/rac2/range_controller.go:2885 [T1,Vsystem,n1,s1,r75/1:/{Table/Max-Max},raft] 460 using low priority for send queue entry in pull mode ``` here: https://github.com/cockroachdb/cockroach/blob/93519aef3385e7dca7b2cda8c528e6fa3027f6e9/pkg/kv/kvserver/kvflowcontrol/rac2/range_controller.go#L2883-L2885 and it fires in the...
[DD tsdump here](https://us5.datadoghq.com/dashboard/bif-kwe-gx2/self-hosted-db-console-tsdump?fromUser=true&refresh_mode=paused&tpl_var_cluster=tbg-20250620-iss148337&tpl_var_upload_day=20&tpl_var_upload_id=tbg-20250620-iss148337-20250620092035&tpl_var_upload_month=6&tpl_var_upload_year=2025&from_ts=1749910109215&to_ts=1749922782339&live=false) but [I get lots of bogus-looking data](https://cockroachlabs.slack.com/archives/C063CP41TG9/p1750417963960139?thread_ts=1750417237.520119&cid=C063CP41TG9) so using Grafana. Looks like just after n12 comes back after scheduled downtime, n5 has a few (maybe just one)...
n12 is expected to be overloaded since it's coming back from 10 min of downtime, but the problem that breaks the test is slow pebble on n2. The load listener...