Fix RegionAwareEnsemblePlacementPolicy not updating rackInfo when a bookie leaves and rejoins
Motivation
When using RegionAwareEnsemblePlacementPolicy in Pulsar, we encountered the following sequence:
- The bookie shuts down, triggering handleBookiesThatLeft().
- The bookie's rackInfo is changed, triggering onBookieRackChange().
- The bookie starts, triggering handleBookiesThatJoined().
However, the rackInfo is not actually updated, and the broker still writes ensembles using the old rackInfo.
The cause is in RegionAwareEnsemblePlacementPolicy#getLocalRegion(): it maintains a local map, address2Region, which is populated with each bookie's region in handleBookiesThatJoined() and onBookieRackChange(). However, if the rackInfo is changed while the bookie is shut down, address2Region is not updated, so lookups return the stale rackInfo. When the bookie restarts, handleBookiesThatJoined() still reads the stale rackInfo from address2Region.
The relevant code is in: https://github.com/apache/bookkeeper/blob/bf9a5cf3ee457797542701f0f336bede227c7fbd/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/RegionAwareEnsemblePlacementPolicy.java#L88-L102
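The stale-cache behavior described above can be illustrated with a minimal, self-contained sketch. This is a simplified model, not the BookKeeper implementation: the class RegionCacheSketch, the refreshOnJoin flag, the string-keyed map, and the "UnknownRegion" placeholder are all hypothetical names chosen for illustration. Only the idea of a region cache that is filled once and never refreshed on join comes from the description.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical, simplified model of the region cache (not BookKeeper code).
class RegionCacheSketch {
    // Stands in for address2Region: bookie address -> cached region.
    private final Map<String, String> address2Region = new ConcurrentHashMap<>();
    // Stands in for the rack resolver: bookie address -> "/region/rack".
    private final Function<String, String> resolver;
    // false models the behavior before the fix, true models the proposed change.
    private final boolean refreshOnJoin;

    RegionCacheSketch(Function<String, String> resolver, boolean refreshOnJoin) {
        this.resolver = resolver;
        this.refreshOnJoin = refreshOnJoin;
    }

    // Extracts the region from a network location such as "/region-a/rack-1".
    static String parseRegion(String networkLocation) {
        if (networkLocation == null) {
            return "UnknownRegion";
        }
        String[] parts = networkLocation.split("/");
        return parts.length <= 1 ? "UnknownRegion" : parts[1];
    }

    // Resolves the region only on a cache miss, so a cached entry is never refreshed here.
    String getRegion(String addr) {
        return address2Region.computeIfAbsent(addr, a -> parseRegion(resolver.apply(a)));
    }

    void handleBookiesThatJoined(Set<String> joined) {
        for (String addr : joined) {
            if (refreshOnJoin) {
                // Proposed change: re-resolve and overwrite the cached region on join.
                address2Region.put(addr, parseRegion(resolver.apply(addr)));
            }
            // Without the refresh, this returns whatever was cached earlier.
            getRegion(addr);
        }
    }
}
```

In this model, if a bookie joins with location /region-a/rack-1, its location is then changed to /region-b/rack-1 while it is down, and it joins again, the instance created with refreshOnJoin set to false keeps reporting region-a, while the instance with refreshOnJoin set to true reports region-b.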
Another situation can also cause this problem:
- The broker restarts while the bookie is shut down.
- During the broker restart, it cannot resolve the bookie's rackInfo, so the bookie's rackInfo falls back to /default-region/default-rack.
- The bookie recovers and starts, but its rackInfo remains /default-region/default-rack, which is incorrect.
Changes
- The map address2Region is held locally, so it should be updated in handleBookiesThatJoined().
- Add a test to verify the fix.