cockroach
cockroach copied to clipboard
kvserver: proxy requests that return a NotLeaseholderError
This handles Partial partitions from the DistSender layer.
Here is an example of how this test is set up
n3
/
n1---n2
n1 is the client and n2 is the leaseholder.
The connection between n1 and n2 is broken
n3
/
n1 n2
For the following request this sequence happens:
n1 -> n2 -> fails as expected - (network cut) n1 -> n1 -> fails with not leaseholder don't proxy since self-client n1 -> n3 -> fails as expected, but proxy to its local dist_sender n3 -> n1 -> fail as expected. leaseholder not known and can't proxy n3 -> n2 -> success
We could optimize out the first leaseholder lookup on n3 by pre-populating its distsender cache before the proxy request, but that requires a bit of a layering violation and isn't necessary for correctness.
Epic: none
Release note: None
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
This is mostly working now. There are a few issues still open related to it:
- Observability. We need metrics that show when proxying is happening. (In progress)
- Changefeeds: These will currently block during a partition and are not easily observable.
- DistSQL: Its unclear if this is a problem. There isn't great testing related to this right now so at a minimum we should add that.
bors r=erikgrinaker
TFTR!