valkey icon indicating copy to clipboard operation
valkey copied to clipboard

Replicas should indicate their version to their Primaries during handshake

Open madolson opened this issue 1 year ago • 10 comments

I've seen it pop up in a few places where it be ideal to know what the version of the replica is we are replicating too, to potentially send something it might understand (and be potentially more compatible). I would suggest we extend the REPLCONF command to also support configuring the version metadata. Something like:

REPLCONF version 7.0.5

The replica would optionally send this after the other capabilities, so that it could gracefully handle the case where the primary does not support it.

I thought about this after seeing, https://github.com/valkey-io/valkey/pull/245/files#r1586619641 (moved to follow-up issue #421), which would allow us to abort if a replica isn't on the same version as us.

madolson avatar May 02 '24 00:05 madolson

Hi @madolson. I want to take a look at this issue. I'm not exactly sure if I have to be assigned to an issue before I can work on it, or if I can go ahead and find a fix

adetunjii avatar May 02 '24 00:05 adetunjii

@adetunjii I'm okay with you working on it, but we still need to get consensus first we want to build it.

madolson avatar May 02 '24 01:05 madolson

I do not object to add this argument/value pair. But this internal command could be called by replica and master. Thus I want to suggest if replica can call this command with format "REPLCONF version version-number", and master could call this command as "REPLCONF version" and get replica version number as well.

hwware avatar May 02 '24 19:05 hwware

@valkey-io/core-team Thoughts on this? Not asking for a specific vote, but feel free to 👍 or 👎 .

madolson avatar May 03 '24 18:05 madolson

In general, I think it's better to report capabilities.

Feature negotiation between two peers = exchange capabilities flags.

For #245 (or its follow-up #421) I think we can use the cluster bus instead. We can just pick an unused bit in clusterMsg as a capability flag.

zuiderkwast avatar May 03 '24 18:05 zuiderkwast

Feature negotiation between two peers = exchange capabilities flags.

This doesn't really solve the problem unless we are going to send every single forward compatibility breaking change as a capability.

madolson avatar May 03 '24 18:05 madolson

I assume the proposed version scheme is meant to track the compatibility of the wire protocol (in terms of new commands/flags/etc)?

I am generally aligned with the version scheme. We can discuss the details over the PR.

Also generally speaking, I don't think a granular compat negotiation protocol like "capability bits" would scale.

PingXie avatar May 03 '24 19:05 PingXie

Feature negotiation between two peers = exchange capabilities flags.

This doesn't really solve the problem unless we are going to send every single breaking change as a capability.

Capability flag for each new feature is what I had in mind. Version check works too. I don't have a strong opinion.

zuiderkwast avatar May 04 '24 00:05 zuiderkwast

I like this idea, primary and replicas should exchange their version. But I don't think we should abort the replication if the replica's version is not same with primary, that would block upgrade and rollback via replication.

soloestoy avatar May 16 '24 07:05 soloestoy

Version might be good to have for INFO fields or to use as capas for e.g. new types or such. But there are more aspects than version, like config and modules, that affect what a replica is capable of.

For the #421 issue, what the primary needs to know is that the replica can accept replication of CLUSTER SETSLOT. I believe this should only be enabled if the replica is >= 8.0 and in cluster mode. So I'm thinking that maybe a capa is needed, or that we use a bit in the cluster bus to signal this capa.

Is it possible for a replica to be in non-cluster node and replicate from a cluster primary? (I'm thinking about fake-replicas like valkey-cli and others special scenarios.)

zuiderkwast avatar May 16 '24 08:05 zuiderkwast

It seems we have consensus (or at least majority) on this.

I opened a PR.

I do not object to add this argument/value pair. But this internal command could be called by replica and master. Thus I want to suggest if replica can call this command with format "REPLCONF version version-number", and master could call this command as "REPLCONF version" and get replica version number as well.

@hwware The replica sends all its info using REPLCONF during handshake (listening-port, ip-address, capa). There is no need for the primary to ask the replica for this later. It seems logical to do the same with REPLCONF VERSION x.y.z.

zuiderkwast avatar May 26 '24 12:05 zuiderkwast