
replace quota floor with ceil

Open to6ka opened this issue 2 years ago • 10 comments

Hi, I have a case with a CPU quota of 2.9, so I think the proper behavior is to set max procs to 3 instead of 2.

to6ka avatar May 10 '23 16:05 to6ka


anybody?

to6ka avatar May 16 '23 11:05 to6ka

Hey @to6ka, I'm not a maintainer, but:

  • This would be quite a significant change to behavior, and could be considered a breaking change.
  • This was intentionally changed in the past. There's some discussion in #13.

I suspect the maintainers would be more open to an option to supply a custom func(float64) float64 to replace math.Floor for specific use cases.

abhinav avatar May 16 '23 17:05 abhinav

I suspect the maintainers would be more open to an option to supply a custom func(float64) float64 to replace math.Floor for specific use cases.

Yes. By default we should keep floor, but we're open to exposing an option that can change this behavior.

sywhang avatar May 16 '23 18:05 sywhang

Thanks for your feedback. This is surprising information to me :)

As I can see, this change is really old and it may be too late for discussion, but I have two points against it:

  • Leaving the remaining CPU for other demands is, at the moment, an undocumented and unconfigurable feature.
  • As a user of a public library, I expect it to let me utilize the maximum of my resources.

Right now I have a service with a quota of 2.9 in Kubernetes. This means the process utilizes 2 CPUs, but it does not mean the process has enough CPU. It may really be throttled, but by the Go scheduler rather than the system scheduler. This could possibly be caught by monitoring, but since there is no system-level throttling and the process formally utilizes only about 70% of its quota, it is invisible to resource monitoring, and I think that is bad.

I just want to say that this behavior is not obvious and it might be worth revisiting.

to6ka avatar May 16 '23 22:05 to6ka

Thanks for the feedback.

Leaving the remaining CPU for other demands is, at the moment, an undocumented and unconfigurable feature.

Agreed that this can be confusing. I can add docs on the default behavior to make things clearer. Also, we're open to introducing options to make this configurable, as I mentioned previously.

As a user of a public library, I expect it to let me utilize the maximum of my resources.

One could argue that a worse situation is when utilization rises above the expected threshold.

Consider a situation where you have three processes that each make use of a quota of, say, 1.6. As a user, you expect them to take up "4.8" cores. But if ceil is used, we allocate 2 cores each and end up with 6 total cores, which exceeds the expected threshold by more than a whole core.

sywhang avatar May 17 '23 17:05 sywhang

Consider a situation where you have three processes that each make use of a quota of, say, 1.6. As a user, you expect them to take up "4.8" cores. But if ceil is used, we allocate 2 cores each and end up with 6 total cores, which exceeds the expected threshold by more than a whole core.

Each process will try to use 2 cores but will not succeed, because of the system scheduler and the quota, and it will eventually be throttled. So this logic tries to save me from overusing CPU and being throttled, but I don't understand why. If I set a quota of 1.6, maybe I actually want to utilize that extra 0.6 of a core? And what about the case of a quota of 0.6?

I thought all libraries were tuned for maximum throughput by default, and that's what I expect. If you need something else, you take additional steps: set options, etc. This one is tuned for best latency (maybe) and some other non-obvious demands.

This situation is really weird to me: the library is designed to better fit my app to my resources, yet it prevents me from using those resources.

Also, we're open to introducing options to make this configurable, as I mentioned previously.

An option is OK, but it changes the whole way of using the lib. And as I mentioned earlier, my vision of the default behavior is maximum throughput. So potentially I'd need to change how the lib is used everywhere =/

to6ka avatar May 18 '23 12:05 to6ka

Also, my case was mentioned in the original thread :)

I understand that nobody wants to change the default behavior at this point, so I will try to add an option; that is better than nothing.

Thank you for this discussion :)

to6ka avatar May 18 '23 12:05 to6ka

@to6ka Thanks for having a look at this issue. Any updates on adding an option to use ceil? One problem with floor hasn't been mentioned yet: using floor in conjunction with Kubernetes autoscaling can have severe consequences. If the autoscaler is configured to trigger a scale-up when, e.g., CPU load exceeds 70% and the current limit is 1.9 CPUs, the floor function will prevent a scale-up since 70% can never be reached. As a result, the whole service might become unstable.

Garfield96 avatar Jul 15 '23 12:07 Garfield96

Hi, I have some code, but unfortunately I've been too busy to finish it. This change requires some refactoring, and I don't know if the maintainers will like it :D

Also, the new way of using the lib is more complicated than a simple import, and most people will keep using the old way, or default options with the old behavior, which is buggy as you know. So now I doubt whether this option will be useful or not.

But I will try to finish it soon.

Off-topic: to overcome this problem with the old usage pattern, I think the best approach would be to release a v2 of the lib and deprecate the old import-based usage. That would make people use explicit initialization. I also think it would be better for v2 to use ceil by default.

to6ka avatar Jul 17 '23 11:07 to6ka