Rafi Shamim
``` "ops": [ "BEGIN", { "sql": "ALTER TYPE schema_w6_74.enum_w19_134 DROP VALUE 'ss'" }, { "sql": "INSERT INTO schema_w4_75.table_w16_187 (\"col18%q7_w16_188\",col187_w16_189,col187_w16_190,col187_w16_191,\"\tcol187_w16_192\") VALUES ((-121):::INT8,3802:::OID,(-3.935225006659739242E+23):::DECIMAL,NULL,e'\\U0004E661\\U0002726F\\U000E4CF4\\U000D4205\\U00095D2E\\U0009B8BE' COLLATE de_DE),(124:::INT8,20:::OID,28136207.90280028999:::DECIMAL,'BOX(-1.4274178705313474 -0.24870956313143466,-0.3637420264055242 0.34687224964684116)':::BOX2D,e'\\U0006121D\\U0006053B\\U000C71C9\\U0006628B' COLLATE de_DE)" } ], "expectedExecErrors":...
``` Error: ***UNEXPECTED ERROR; Failed to generate a random operation: error getting random table name: ERROR: inbox communication error: rpc error: code = Canceled desc = context canceled (SQLSTATE 58C01)...
Closing since this doesn't backport cleanly.
The error is `n3 required, but unavailable`. It happens here, when the upgrades framework uses KV node liveness to check which nodes are up: https://github.com/cockroachdb/cockroach/blob/32622e1b18030bd52529841ec1bb280a5683d5cb/pkg/upgrade/upgradecluster/nodes.go#L31-L62 In the CRDB logs, it...
Is the solution here to improve the `UntilClusterStable` function so it can tolerate temporary unavailability of a node? I'm not sure if that's just a band-aid we should try to...
I saw the same symptom in this CI run: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelEssentialCi/14683931?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildTestsSection=true ``` Failed === RUN TestLogic_upgrade_skip_version test_log_scope.go:170: test logs captured to: /artifacts/tmp/_tmp/d344e0c08e5dac50ac55f38c33032127/logTestLogic_upgrade_skip_version3569230637 test_log_scope.go:81: use -show-logs to present logs inline [16:45:01] ---...
> As the tests are written today, many of them will fail with a single heartbeat failure, which is unfortunate. I think the code in upgradeStatus is too strict today...
looks the same as https://github.com/cockroachdb/cockroach/issues/120521
The 5 most recent failures (edit: and also the one after this comment) are resolved by https://github.com/cockroachdb/cockroach/pull/126453; leaving this open for the original issue.