pulsar
pulsar copied to clipboard
[improve][broker]improve the getAntiAffinityNamespaceOwnedBrokers check exclude the current namespace cause count add.
Motivation
As the PIP 7: Pulsar Failure domain and Anti affinity namespaces design, the anti affinity namespaces should be distributed to evenly across all domain and all the brokers.
But in method getAntiAffinityNamespaceOwnedBrokers
, brokerToAntiAffinityNamespaceCount
add a count even the namespace equal the given bundle's namespace which will be load. The behavior will cause the namespace easy to distributed to
a broker which has another namespace in anti affinity group, the behavior broke the anti affinity balance. It's better behavior is the same namespace should be distributed to those broker which has loaded the same namespace when all broker have load at least one namespace in ti affinity group.
there is some case ns-0 ns-1 ns-2 are all set the same anti affinity group like 'a-group
broker-0
own ns-0 bundle-0
;
broker-1
own ns-1 bundle-0
;
broker-2
own ns-2 bundle-0
;
then another ns-2 bundle-1
need choice a broker to load. As the old policy, broker-0 broker-1 and broker-2
are all satisfy the least NamespaceCount; but if ns-2 bundle-1
load to broker-0 or broker-1
will broke the anti affinity balance. ns-2 bundle-1
need be load by broker-2 is better.
Also, the behavior will cause give up doLoadShedding
when the all broker own one namespace in anti affinity group, but the different broker owned namespace bundle count and payload is different.
So brokerToAntiAffinityNamespaceCount
should exclude the given namespace count add.
Modifications
- In LoadManagerShared.class
getAntiAffinityNamespaceOwnedBrokers
method, exclude the given namespace count add. - changed
shouldAntiAffinityNamespaceUnload
check; - add some unit test.
Documentation
- [X]
doc-not-needed
Matching PR in forked repository
PR in forked repository: https://github.com/Nicklee007/pulsar/pull/5
@codelipenghui @Jason918 @eolivelli Could you help to review this PR, Thx.
there is some case
broker-0
ownns-0 bundle-0
;broker-1
ownns-1 bundle-0
;broker-2
ownns-2 bundle-0
; then anotherns-2 bundle-1
need choice a broker to load. As the old policy,broker-0 broker-1 and broker-2
are all satisfy the least NamespaceCount; but ifns-2 bundle-1
load tobroker-0 or broker-1
will broke the anti affinity balance.ns-2 bundle-1
need be load by broker-2 is better.
What's the anti affinity setting in this case?
What's the anti affinity setting in this case?
@Jason918 ns-0 ns-1 ns-2 are all set the same anti affinity group like 'a-group', and the three ns be anti affinity load is expected.
/pulsarbot run-failure-checks
The pr had no activity for 30 days, mark with Stale label.
/pulsarbot run-failure-checks
Codecov Report
Merging #16563 (118b16b) into master (b05fddb) will decrease coverage by
14.98%
. The diff coverage is0.00%
.
@@ Coverage Diff @@
## master #16563 +/- ##
=============================================
- Coverage 45.64% 30.66% -14.99%
+ Complexity 11043 6016 -5027
=============================================
Files 773 633 -140
Lines 74463 59892 -14571
Branches 8018 6241 -1777
=============================================
- Hits 33986 18363 -15623
- Misses 36687 38973 +2286
+ Partials 3790 2556 -1234
Flag | Coverage Δ | |
---|---|---|
unittests | 30.66% <0.00%> (-14.99%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
...sar/broker/loadbalance/impl/LoadManagerShared.java | 39.39% <0.00%> (-4.03%) |
:arrow_down: |
...n/java/org/apache/pulsar/client/api/RawReader.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
...ava/org/apache/pulsar/broker/admin/v1/Brokers.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
...ava/org/apache/pulsar/broker/admin/v2/Brokers.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
...va/org/apache/pulsar/broker/admin/v1/Clusters.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
.../org/apache/pulsar/broker/admin/v1/Properties.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
.../org/apache/pulsar/client/impl/RawMessageImpl.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
.../apache/pulsar/broker/admin/v2/ResourceGroups.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
...ar/common/naming/PartitionedManagedLedgerInfo.java | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
...e/pulsar/broker/admin/impl/ResourceQuotasBase.java | 0.00% <0.00%> (-96.43%) |
:arrow_down: |
... and 308 more |
/pulsarbot run-failure-checks
The pr had no activity for 30 days, mark with Stale label.