grpc-go icon indicating copy to clipboard operation
grpc-go copied to clipboard

ringhash: fix normalizeWeights

Open arvindbr8 opened this issue 1 year ago • 4 comments

RELEASE NOTES:

  • ringhash: fix normalizeWeights in the event of just a weight update

arvindbr8 avatar Apr 19 '24 23:04 arvindbr8

Thanks for the review @atollena and @aranjans.

I still think this diff is not ready to be merged yet. There is still the open question why getWeightAttribute(a) doesnt work in this case, even though (*subConn).weight is set using getWeightAttribute(a). I wanted to check with @zasweq if he knows what's going on.

arvindbr8 avatar Apr 22 '24 16:04 arvindbr8

There's two other maintainers actively looking at this so I'll remove myself.

zasweq avatar Apr 22 '24 17:04 zasweq

Thanks for the review @atollena and @aranjans.

I still think this diff is not ready to be merged yet. There is still the open question why getWeightAttribute(a) doesnt work in this case, even though (*subConn).weight is set using getWeightAttribute(a). I wanted to check with @zasweq if he knows what's going on.

Ha. I thought that you had that figured out. I took a look at the code and, with my limited understanding, it looks like the "bug" is in resolver.AddressMap.Set: https://github.com/grpc/grpc-go/blob/master/resolver/map.go#L81-L90

At L86, when we found that we already had the address in the map, the code only updates the value, not the addr. So if the BalancerAttribute changed, the addr isn't updated. As a result, if the weight changes, which is a BalancerAttribute, then the resolver map does not update the entry with the new address.

From the documentation of resolver.AddressMap, it sounds like resolver.BalancerAttribute should always be ignored, and two addresses with all fields equal except for resolver.BalancerAttribute and Metadata have the same key. But then, it's not super clear whether calling Set twice with two addresses having varying resolver.BalancerAttribute should update the key.

I see two options to fix this:

  1. Make resolver.AddressMap update Metadata and BalancerAttribute when setting the value associated with an address.
  2. Make resolver.AddressMap always clear Metadata and BalancerAttribute before storing the address entry. I think it would be a bit less confusing, and make it clear that Metadata and BalancerAttribute should be stored separately when using an address map, e.g. in the value. This is a breaking change though, since resolver.AddressMap is public (it might not hurt to mark it experimental -- FWIW we have 1 usage of it internally at datadog, for a custom balancer which is a variation of weighted round robin).

I'd sollicit Doug and Zach's input on this because I don't have the context of when address map is useful (looks like it's mostly to find existing subconn associated with an address in O(1)?), but option 1 seems more realistic.

atollena avatar Apr 23 '24 08:04 atollena

If we are open to making a breaking change, another option to consider, in addition to those listed in https://github.com/grpc/grpc-go/pull/7156#issuecomment-2071761929 :

  1. Introduce a new struct type to serve as the key for AddressMap.m, call it MapKey. MapKey will contain only the Addr and ServerName strings as members. Change the AddressMap.Keys() method to return objects of MapKey. This way the compiler ensures that callers access only those fields of the key that are set.

We could also achieve the same by first deprecating the existing Keys method and introducing a new method with the behaviour described above.

arjan-bal avatar Apr 29 '24 07:04 arjan-bal

From the documentation of resolver.AddressMap, it sounds like resolver.BalancerAttribute should always be ignored, and two addresses with all fields equal except for resolver.BalancerAttribute and Metadata have the same key. But then, it's not super clear whether calling Set twice with two addresses having varying resolver.BalancerAttribute should update the key.

what the docstring on resolver.AddressMap really talks about is that we should not be relying on the balancerAttributes in the key, but use value instead. And in ringHash, the value is of type ringhash.subConn which actually contains the weight information.

The actual bug is with using the rignhash.getWeightAttribute helper. This is helpful when getting the balancerattribute weight when updating addresses. But is incorrect when used to get weights from the AddressMap. Instead, we should type assert to ringhash.subConn and read weights from that instead.

cc'ing: @easwars; since he has most context on ringhash.

arvindbr8 avatar May 14 '24 18:05 arvindbr8

Testing / tests (tests, 1.22, 386) (pull_request)

Something is wrong with this presubmit check. Will need to investigate.

Merging since this change is not expected to break 386 variant of go1.22 test

arvindbr8 avatar May 29 '24 23:05 arvindbr8