bazel icon indicating copy to clipboard operation
bazel copied to clipboard

Incorrect estimate of available memory with --experimental_local_memory_estimate

Open andreasg123 opened this issue 4 years ago • 5 comments

Description of the problem / feature request:

--experimental_local_memory_estimate seems to produce an incorrect estimate of available memory, causing builds to be delayed by many minutes.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

When checking where --experimental_local_memory_estimate is evaluated, it assigns

double reserveMemory = staticResources.getMemoryMb();

and then computes

remainingRam = totalFreeRam - reserveMemory;

As staticResources appears to be a copy of availableResources (both are set in setAvailableResources; availableResources is public but not manipulated elsewhere as far as I can tell), I believe that the computation should be this:

remainingRam = Math.min(remainingRam, totalFreeRam);

If my interpretation is correct, assigning more memory to Bazel would actually leave less remaining memory because all the assigned memory is being treated as reserved elsewhere.

https://github.com/bazelbuild/bazel/blob/fa9fabb3f699f1e8e8ebbe22228fc4acbc9df2ba/src/main/java/com/google/devtools/build/lib/actions/ResourceManager.java#L389-L392

What operating system are you running Bazel on?

LSB Version:	:core-4.1-amd64:core-4.1-noarch
Distributor ID:	Amazon
Description:	Amazon Linux release 2 (Karoo)
Release:	2
Codename:	Karoo

What's the output of bazel info release?

INFO: Invocation ID: 2a8b8b1f-918a-4d50-bcfb-fdf655a6dabe
release 4.0.0

andreasg123 avatar Mar 13 '21 04:03 andreasg123

I would probably use the following to reduce the chance of out-of-memory errors:

remainingRam = Math.min(remainingRam, 0.95 * totalFreeRam);

andreasg123 avatar Mar 13 '21 04:03 andreasg123

@ulfjack I think you added this flag. Any thoughts?

aiuto avatar Mar 31 '21 04:03 aiuto

Possibly the bug was introduced in this commit @susinmotion ?

https://github.com/bazelbuild/bazel/commit/7f530c88cf3b66b7ff0c585f1cedf401e01138f0#diff-95029ccab6cd5d02bb23eb7c648ed9b57da4f02f21dc95a903aed959a28754caR390

-        double reserveMemory =
-            staticResources.getMemoryMb() * (100.0 - this.ramUtilizationPercentage) / 100.0;
+        double reserveMemory = staticResources.getMemoryMb();
         remainingRam = totalFreeRam - reserveMemory; 

I think the previous logic tried to achieve something like:

  memoryNotAvailableToBazel = totalMachineMemory - staticResources.getMemoryMb()
  remainingRam = totalFreeRam - memoryNotAvailableToBazel

glukasiknuro avatar Mar 10 '22 01:03 glukasiknuro

Thank you a lot for these notices! I will change a logic in this code to remove the old incorrect code.

wilwell avatar Jun 28 '22 14:06 wilwell

Per this thread in the bazel slack, it appears that --experimental_local_memory_estimate causes Bazel to execute no more than one local action at a time (as observed with ps). Could the incorrect estimate reported here be responsible for this behavior?

dws avatar Dec 18 '22 08:12 dws

FYI, 7.0.0-pre contains https://github.com/bazelbuild/bazel/commit/361ce673ad2b959b73e859a121ff2e996feb561b which removes this option altogether

mafanasyev-tri avatar Oct 06 '23 18:10 mafanasyev-tri