Option to Control Memory Consumption Estimates
This change allows us to control the resource estimates of individual actions via their mnemonics. We notice that in large codebases, the volatility of estimates is extremely high for post-build operations such as dexing.
Why do we not put the estimate directly in the action itself? Why do we need to override it in the ResourceManager?
If you look up the usages of .setMnemonic(), you will see that they are spread across many packages and classes, and not every class has access to the options.
It seems very complicated to give every place where a SpawnActionBuilder gets created and used access to the override options. The fix becomes simpler when we apply the override before the ResourceManager gets invoked.
You are also not guaranteed where/when the resource builder gets called in the future. This could break the option if a developer forgets to apply it there.
Other Issues/Request For Feedback
I am not sure about placing a parsing operation inside ResourceManager, as I currently do. This seems extremely odd, but Options itself does not have exposure to ResourceSet.
This is to fulfill feature request: https://github.com/bazelbuild/bazel/issues/15668
Sample usage: build:devpod --mnemonic_resource_override=KotlinFoldJarsAbi=2200,4
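For illustration, here is a rough sketch of how a flag value of that shape could be split into a mnemonic plus a memory/CPU pair. The function name and return shape are hypothetical and are not the PR's actual Java implementation:

```python
def parse_override(spec):
    """Parse a spec like "KotlinFoldJarsAbi=2200,4" into
    (mnemonic, memory_mb, cpus). Hypothetical helper for illustration."""
    mnemonic, _, values = spec.partition("=")
    mem, _, cpu = values.partition(",")
    return mnemonic, float(mem), float(cpu)

print(parse_override("KotlinFoldJarsAbi=2200,4"))
```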
Hello @EdbertChan, can you please check the build failures? Thank you!
@sgowroji Build has passed!
@sgowroji Please let me know if/when you'd like to have this PR landed. We have a follow-up diff on our side that will allow HOST_CPUS and HOST_RAM to be used as well, not just integers. However, that would grow the size of this diff quite a bit, and I'd prefer to land smaller diffs.
Hello @EdbertChan, thank you for your patience. We have shared the PR with our team for further check and review. We will let you know the update very soon.
I agree that the estimates are not good in general and we need more ways to control them. Your PR looks reasonably good, but we'll have to consider if we want to expose this interface. Yes, it would allow you to work around your current issues with some manual experimentation and configuration, but we would not want all users to have to do that.
Since commit https://github.com/bazelbuild/bazel/commit/d7f0724b6b91b6c57039a1634ff00ccebd872714, it is possible to set the expected resource usage in the rule definition, using the resource_set parameter. For dexing, we would obviously still have to update the Java code, but the same functionality is available.
Most importantly, with the new functionality (whether in Starlark or Java), it is possible to adjust the estimates based on the number of inputs, which your code does not allow. Memory consumption in particular tends to correlate with the number of inputs, up to a certain level. Using a fixed resource value would always badly under- or over-estimate a number of actions.
@EdbertChan Any thoughts on my comment above?
@larsrc-google
Sorry for the delay. It looks like the diff you posted gives actions the power to set the amount of resources they will use via "resource_set".
1. Correct me if I'm wrong, but does that not mean we will have to wait for some of these rules (i.e. Kotlin) to implement this new "resource_set" parameter before we reap the benefit, just as you have in ProtoCompileActionBuilder? What is the forcing function for doing this?
2. Since we're dealing with the tail risk of apps compiling codebases big enough to exceed the default memory limits, my gut feeling is that it is a mistake to assume a linear distribution and scaling. Off the top of my head, because of generated code and dependency injection, an app at the root of the dependency graph is likely to have exponentially more inputs and exponentially more targets, and thus exponentially more size. This can be seen in the fact that our dexing operations take 4GB+.
Give me some time to generate the data for the second point. Thank you for giving this diff attention.
ad 1: True, there is no forcing function, but there is the carrot that more accurate estimates would allow better use of local resources - faster builds when the estimate is lower, less memory consumption when the estimate is higher. I do believe all rule maintainers want faster builds, though they need to be aware of this possibility.
ad 2: There's no assumption of linearity; you could have a function that estimates 1MB for odd numbers of inputs and 20G for even numbers, if you really wanted (I knew learning about strange integrals would come in handy some day :)). If it really does have exponential behaviour, you have bigger issues than good estimates. I would expect nothing higher than quadratic. So far, what we've seen has been pretty close to linear.
@larsrc-google Should we include this to release-7.0.0? Also, I don't think this was even merged to master yet.
cc: @Wyverald, @meteorcloudy, @bazelbuild/triage
No, I don't believe this is a good approach. @wilwell is the person who works on this now, though.
This shouldn't go immediately into release-6.4.0. Anyway, commits need to go into master first, and 6.4.0 is already released.