
Pass extensions to node if extension processing is handled by it

Open hpatro opened this issue 1 year ago • 22 comments

Ref: https://github.com/redis/redis/pull/12760

Description

Fixes compatibility of a PlaceholderKV cluster (7.2, extensions enabled by default) with older Redis clusters (< 7.0, extensions not handled).

With some extensions enabled by default in the 7.2 version, new nodes running 7.2 and above start sending out a larger cluster bus message payload that includes the ping extensions. This caused an incompatibility with nodes running engine versions < 7.0. An old node (< 7.0) receiving the payload from a new node (>= 7.2) would observe a payload length (totlen) greater than its expected length (estlen), perform an early exit, and not process the message.

This fix introduces a flag extensions_supported on the clusterMsg indicating that the sender node can handle extension parsing. Once a receiver node receives a message with this flag set to 1, it updates the new clusterNode field extensions_supported and starts sending out extensions if it has any.
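The length check that makes old nodes drop extension-bearing messages, and the flag-based handshake described above, can be sketched roughly as below. The struct layout, bit positions, and helper names are illustrative assumptions, not the actual Valkey definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the cluster bus header. Field and flag
 * names follow the PR description; the layout and bit position are assumed. */
#define CLUSTERMSG_FLAG0_EXT_DATA (1 << 2) /* assumed bit position */

typedef struct {
    uint32_t totlen; /* total length of the message as received */
    uint16_t flags0; /* header flags; EXT_DATA marks extension support */
} clusterMsgHeader;

/* Pre-7.0 behavior: a payload longer than the locally computed expected
 * length (estlen) is rejected outright, so trailing extension bytes cause
 * the whole message to be dropped. */
int old_node_accepts(const clusterMsgHeader *h, uint32_t estlen) {
    return h->totlen <= estlen; /* totlen > estlen => early exit */
}

/* With the fix: the receiver records whether the sender can parse
 * extensions, and only then starts attaching extensions itself. */
int sender_supports_extensions(const clusterMsgHeader *h) {
    return (h->flags0 & CLUSTERMSG_FLAG0_EXT_DATA) != 0;
}
```

An old node rejects any message whose totlen exceeds its estlen, which is exactly why extension bytes from a newer sender made the whole message unprocessable.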

This PR also introduces a DEBUG subcommand, process-clustermsg-extensions, to enable/disable processing of cluster message extensions.

Note: A successful PING/PONG exchange is required before a given node is marked as extensions_supported, after which extension messages will be sent to it. This could cause a slight delay in receiving the extension message(s).

Testing

A TCL test verifies that the cluster state is healthy irrespective of enabling/disabling the cluster message extensions feature.

hpatro avatar Mar 27 '24 19:03 hpatro

One of the tests fails in a few of the runs.

[err]: Verify the nodes configured with prefer hostname only show hostname for new nodes in tests/unit/cluster/hostnames.tcl
Expected '' to be equal to 'shard-1.com' (context: type eval line 60 cmd {assert_equal [lindex [get_slot_field $slot_result 0 2 3] 1] "shard-1.com"} proc ::test)

Taking a look.

hpatro avatar Mar 27 '24 21:03 hpatro

Regarding the flaky test. There is a slight behavior change with this PR.

A successful PING/PONG exchange is required before a given node is marked as extensions_supported, after which extension messages will be sent to it. This could cause a slight delay in receiving the extension message(s). (Added it to the top level comment as well)

hpatro avatar Mar 27 '24 22:03 hpatro

A successful PING/PONG exchange is required before a given node is marked as extensions_supported, after which extension messages will be sent to it. This could cause a slight delay in receiving the extension message(s). (Added it to the top level comment as well)

So this happens in two flows:

  1. The initial MEET flow, where a node new to a cluster has to omit extensions since it doesn't know whether the target supports them. After the target of the MEET receives the message, it will show the node without extensions (not great) until it gets a follow-up PONG message. We could solve this by omitting a node from cluster output (if it sent us a message with the extension bit set).
  2. When gossiping, other nodes will learn about the new node, and they also send it a message without extensions. This results in the same problem as case 1, but it is easier to resolve since we can gossip whether a node supports extensions. There are 16 bits of flags (of which only 10 are being used) on the node flags that could be used for this.

So we really just need a good solution for case 1.
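The spare-bit idea mentioned for the gossip case could look roughly like this; the flag names and the chosen bit are hypothetical, not the actual Valkey definitions.

```c
#include <assert.h>
#include <stdint.h>

/* The gossip section carries a 16-bit flags field with spare bits, so an
 * unused bit could advertise extension support for gossiped nodes.
 * All names and bit positions here are illustrative. */
#define NODE_FLAG_MYSELF      (1 << 0)
#define NODE_FLAG_PRIMARY     (1 << 1)
/* ... existing flags occupy the low bits ... */
#define NODE_FLAG_EXT_SUPPORT (1 << 10) /* hypothetical spare bit */

/* A receiver could mark a gossiped node as extension-capable before
 * ever exchanging a direct PING/PONG with it. */
int gossiped_node_supports_extensions(uint16_t gossip_flags) {
    return (gossip_flags & NODE_FLAG_EXT_SUPPORT) != 0;
}
```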

madolson avatar Mar 31 '24 04:03 madolson

@PingXie @madolson I'm addressing the comments on the PR. Will get back in a couple of days.

hpatro avatar Apr 04 '24 21:04 hpatro

@zuiderkwast Unsure about what you meant by assigning it to Ping for this PR; did you mean it for review or something else?

hpatro avatar Apr 04 '24 21:04 hpatro

@zuiderkwast Unsure about what you meant by assigning it to Ping for this PR; did you mean it for review or something else?

@hpatro please continue with this PR. I will be the "designated" reviewer :-)

PingXie avatar Apr 05 '24 01:04 PingXie

Yeah, I meant as the reviewer. Sorry.

zuiderkwast avatar Apr 05 '24 11:04 zuiderkwast

Question: Shall we include this in our 7.2.4 release?

Although it improves compatibility with older Redis versions, it's a difference compared to Redis 7.2.4. We should mark this clearly in our release notes then.

zuiderkwast avatar Apr 05 '24 11:04 zuiderkwast

Shall we include this in our 7.2.4 release?

I think we should try our best to make it happen. It is an important and fully compatible fix.

PingXie avatar Apr 05 '24 13:04 PingXie

@hpatro please continue with this PR. I will be the "designated" reviewer :-)

Reminds me of the fact we should add code owner when we get this all sorted.

madolson avatar Apr 05 '24 20:04 madolson

I think we should try our best to make it happen. It is an important and fully compatible fix.

Yeah, I was pushing for this because I think it's important for compatibility. I would be okay with this landing in a 7.2.5 though; it's important to have eventually.

madolson avatar Apr 05 '24 20:04 madolson

I think I can submit a PR today and get it done by Monday if you all can help review it, @madolson @PingXie. It would be nice to have this in our first release itself.

hpatro avatar Apr 05 '24 20:04 hpatro

Changes done:

  1. Introduce a new node flag CLUSTER_NODE_EXTENSIONS_SUPPORTED to mark whether a node can handle extension data, based on the CLUSTERMSG_FLAG0_EXT_DATA cluster message header flag.
  2. Send extension data only to nodes that can handle extensions, per point 1.
  3. The CLUSTERMSG_FLAG0_EXT_DATA header flag is sent irrespective of whether extensions are present; this helps indicate whether the node supports the extensions feature.
  4. Introduce a debug flag to enable/disable sending extension data. Receiving extensions remains enabled.
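The send-side gating in the change list above could be sketched as follows. The flag names come from the change list, but the struct layout, bit positions, and helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CLUSTER_NODE_EXTENSIONS_SUPPORTED (1 << 11) /* assumed bit position */
#define CLUSTERMSG_FLAG0_EXT_DATA         (1 << 2)  /* assumed bit position */

typedef struct {
    uint16_t flags;     /* per-node flags tracked locally (point 1) */
    int debug_send_ext; /* debug switch: 1 = send extensions (point 4) */
} clusterNodeView;

/* Point 2: attach extension data only when the peer has advertised
 * support and the debug switch hasn't disabled sending. Receiving
 * extensions stays enabled regardless of the switch. */
int should_send_extensions(const clusterNodeView *peer) {
    return peer->debug_send_ext &&
           (peer->flags & CLUSTER_NODE_EXTENSIONS_SUPPORTED);
}

/* Point 3: the header flag is always set, whether or not extension data
 * is attached, so peers learn that this node supports extensions. */
uint16_t header_flags0(void) {
    return CLUSTERMSG_FLAG0_EXT_DATA;
}
```

Always advertising the header flag while gating the payload is what lets old and new nodes coexist: old nodes never see oversized payloads, and new nodes discover each other's support on the first exchange.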

@PingXie @madolson Let me know if you agree with this.

@madolson Regarding your concern about displaying the IP address incorrectly for a period of time, I think we should decouple it from this issue. And I think that issue was present in 7.0 itself. We can discuss further.

hpatro avatar Apr 05 '24 22:04 hpatro

@PingXie Thanks for your feedback.

I initially started off this PR (in Redis) with a setup similar to the one you suggested (fixing the issue and testing it manually). The fake-testing approach seems to bring in a bit of complexity, and it wasn't perfect (I tried to get it out in a hurry). I've trimmed down the code; please have a look now.

hpatro avatar Apr 06 '24 03:04 hpatro

Tested a few things.

Setup

  • 2 shards (1 primary / 1 replica each)
  • Shard 1: Redis 6.2 engine
  • Shard 2: latest Valkey engine with the patch

Operations

  • MEET between the hybrid clusters: the cluster was stable during the operation and slot information was propagated correctly between the clusters.
  • Replaced the Redis 6.2 primary with latest Valkey; the cluster/node remained healthy after the replacement.
  • Added a Redis 7.2 node (without the fix); it shows up as fail for the Redis 6.2 node and remains connected to the latest Valkey node.

hpatro avatar Apr 06 '24 04:04 hpatro

@madolson Regarding your concern about displaying the IP address incorrectly for a period of time, I think we should decouple it from this issue. And I think that issue was present in 7.0 itself. We can discuss further.

Why do you think this? I think this is a regression: if you have hostnames enabled and a new node joins the cluster, for a brief period of time the node will incorrectly show up as not supporting hostnames (even though it does). We've seen issues in the past where clients don't handle this correctly, and some clients see prolonged periods without being able to resolve it. My concern is that it affects new nodes added during steady state, and not just the upgrade case.

madolson avatar Apr 07 '24 02:04 madolson

@madolson Regarding your concern about displaying the IP address incorrectly for a period of time, I think we should decouple it from this issue. And I think that issue was present in 7.0 itself. We can discuss further.

Why do you think this? I think this is a regression: if you have hostnames enabled and a new node joins the cluster, for a brief period of time the node will incorrectly show up as not supporting hostnames (even though it does). We've seen issues in the past where clients don't handle this correctly, and some clients see prolonged periods without being able to resolve it. My concern is that it affects new nodes added during steady state, and not just the upgrade case.

Exactly, this issue exists for a new node being added and is not related to the upgrade scenario. So it could happen for any cluster running with the hostnames feature. Hence, I don't want to couple it with this particular PR.

hpatro avatar Apr 07 '24 05:04 hpatro

Exactly, this issue exists for a new node being added and is not related to the upgrade scenario. So it could happen for any cluster running with the hostnames feature.

Today, the hostname feature blindly sends the extension when a node joins a new cluster, assuming the cluster supports it. Now, with this change, we will wait until we have confirmation that it supports the hostname extension. This is different behavior, no?

madolson avatar Apr 07 '24 05:04 madolson

Exactly, this issue exists for a new node being added and is not related to the upgrade scenario. So it could happen for any cluster running with the hostnames feature.

Today, the hostname feature blindly sends the extension when a node joins a new cluster, assuming the cluster supports it. Now, with this change, we will wait until we have confirmation that it supports the hostname extension. This is different behavior, no?

Won't that node be added via gossip as well, and the hostname won't be present in the gossip information? Am I missing something? 🙈

hpatro avatar Apr 07 '24 05:04 hpatro

Won't that node be added via gossip as well, and the hostname won't be present in the gossip information? Am I missing something? 🙈

That was one of the two cases we discussed, but the case I'm talking about is for meet messages. The first message a node sends to a new cluster will not include its hostname. (Maybe I'm indexing too much on this case though).

madolson avatar Apr 07 '24 05:04 madolson

Exactly, this issue exists for a new node being added and is not related to the upgrade scenario. So it could happen for any cluster running with the hostnames feature.

Today, the hostname feature blindly sends the extension when a node joins a new cluster, assuming the cluster supports it. Now, with this change, we will wait until we have confirmation that it supports the hostname extension. This is different behavior, no?

Why is this a problem? The cluster bus is an eventually consistent system, and the information broadcast via the cluster bus, including "hostname", is inherently dynamic/non-constant. So, conceptually speaking, this behavior change is within the design premise.

PingXie avatar Apr 07 '24 06:04 PingXie

The cluster bus is an eventually consistent system, and the information broadcast via the cluster bus, including "hostname", is inherently dynamic/non-constant.

Although generally this is true, in practice today if you set all the hostnames at startup, a node will never observe a node that doesn't have a hostname. This change introduces a small window where that isn't true, and the gap could be shown to clients, so a user might see errors.

I suppose I need to be prescriptive. It seems like everyone wants something simple, so I can lean into that. I would rather hostnames continue to work as they do, with nodes not checking whether the other node supports them.

madolson avatar Apr 07 '24 06:04 madolson

Besides my comment, I was OK with the design, so I would rather have this in the RC. Folks can comment.

Will file an issue explaining the scenarios better and we can decide over there.

hpatro avatar Apr 09 '24 16:04 hpatro

if you set all the hostnames at startup, a node will never observe a node that doesn't have a hostname. This change introduces a small window where that isn't true, and the gap could be shown to clients, so a user might see errors.

First and foremost, I am not sure about the importance/value of this property of "a node will never observe a node that doesn't have a hostname". This can already happen during rolling upgrades from pre-7.0 builds to 7.2, and all nodes will have to handle it.

Furthermore, the propagation of many other node attributes doesn't have this property today either. Even within the same node, the cluster and replication modules might have different views on the "primary-replica" relationship (for a very short period of time after CLUSTER REPLICATE).

My point is that, with the current cluster design, the operator has to perform an explicit synchronization regardless (think of wait_for_cluster_propagation). So a one-off effort to provide this property (that "a node will never observe a node that doesn't have a hostname") is not going to change the landscape much, but it comes with the downside of excessive complexity (in a part of the system where we are already grappling with reliability issues).

PingXie avatar Apr 11 '24 15:04 PingXie

First and foremost, I am not sure about the importance/value of this property of "a node will never observe a node that doesn't have a hostname". This can already happen during rolling upgrades from pre-7.0 builds to 7.2, and all nodes will have to handle it.

The important point here is that a node can't have a hostname pre-7.0; it was an opt-in feature.

madolson avatar Apr 11 '24 23:04 madolson