pulsar icon indicating copy to clipboard operation
pulsar copied to clipboard

[Improve][broker] When calculating resource usage, use getMaxResourceUsageWithWeight instead of getMaxResourceUsage

Open lordcheng10 opened this issue 3 years ago • 11 comments

Motivation

Use getMaxResourceUsageWithWeight instead of getMaxResourceUsage in selectBrokerForAssignment.

Documentation

Check the box below or label this PR directly.

Need to update docs?

  • [ ] doc-required (Your PR needs to update docs and you will update later)

  • [x] doc-not-needed (Please explain why)

  • [ ] doc (Your PR contains doc changes)

  • [ ] doc-complete (Docs have been already added)

lordcheng10 avatar Jul 01 '22 05:07 lordcheng10

@eolivelli @hanbo1990 @Shoothzj PTAL,thanks!

lordcheng10 avatar Jul 01 '22 05:07 lordcheng10

@codelipenghui PTAL,Thanks!

lordcheng10 avatar Jul 01 '22 05:07 lordcheng10

@gaozhangmin PTAL,thanks!

lordcheng10 avatar Jul 01 '22 05:07 lordcheng10

Is this the same with #16281 ?

Jason918 avatar Jul 01 '22 06:07 Jason918

Is this the same with #16281 ?

Afaik, https://github.com/apache/pulsar/pull/16281 is adding a new strategy, whereas this PR is updating the current default one LeastLongTermMessageRate.java

heesung-sohn avatar Jul 01 '22 20:07 heesung-sohn

To me this fix makes sense. However, I am wondering if there was any reason why it uses getMaxResourceUsage() instead of getMaxResourceUsageWithWeight().

The impact of this change could be broad(hopefully in a positive way) for the clusters that configure the resource weights differently.

Nevertheless, LGTM.

heesung-sohn avatar Jul 01 '22 20:07 heesung-sohn

To me this fix makes sense. However, I am wondering if there was any reason why it uses getMaxResourceUsage() instead of getMaxResourceUsageWithWeight().

The impact of this change could be broad(hopefully in a positive way) for the clusters that configure the resource weights differently.

Nevertheless, LGTM.

I agree. The resource usage weight configurations only work in ThresholdShedder strategy currently. Why use getMaxResourceUsage() instead of getMaxResourceUsageWithWeight() in LeastLongTermMessageRate

HQebupt avatar Jul 02 '22 03:07 HQebupt

Both getMaxResourceUsage() and getMaxResourceUsageWithWeight() are only the calculation methods of resource usage, and getMaxResourceUsageWithWeight() is more flexible than getMaxResourceUsage(), so whether it is OverloadShedder or ThresholdShedder, getMaxResourceUsageWithWeight() is more suitable.

I think getMaxResourceUsage() should be discarded, and getMaxResourceUsageWithWeight() should be used uniformly for resource calculation, not only in selectBrokerForAssignment() method.

@HQebupt @heesung-sn @eolivelli @Jason918

lordcheng10 avatar Jul 02 '22 05:07 lordcheng10

IMO, the real problem is the definition of broker load (aka resource usage) is not consistent in bundle shedder and bundle assignment. The load balance module should have a better abstraction that contains both shedding strategy and loading strategy for bundles, so that it can perform a clear balancing target, which can be message rate or cpu load or an abstract resource usage like MaxResourceUsageWithWeight.

@lordcheng10 @heesung-sn @HQebupt @eolivelli

Jason918 avatar Jul 02 '22 07:07 Jason918

The pr had no activity for 30 days, mark with Stale label.

github-actions[bot] avatar Aug 04 '22 02:08 github-actions[bot]

@lordcheng10 @heesung-sn @HQebupt @eolivelli @Jason918 Is there any progress on this PR? If different dimensions are considered when bundle unloading and assigning, there may be frequent balancing leading to large traffic fluctuations. Moreover, the assigned broker has just been unloaded, which will cause unnecessary traffic fluctuations.

keyboardbobo avatar Dec 08 '23 03:12 keyboardbobo