rules_kotlin

Add `-XX:-MaxFDLimit` to builder binaries so they can use the system open-file limit

Open arunkumar9t2 opened this issue 3 years ago • 6 comments

When compiling large modules, we noticed that KotlinKapt would fail with "Too many open files" on macOS.

We manually increased the shell limit from the default of 256 to 65536 and verified it by running ulimit -a:

-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8176
-c: core file size (blocks)         0
-v: address space (kbytes)          unlimited
-l: locked-in-memory size (kbytes)  unlimited
-u: processes                       5333
-n: file descriptors                65536

Even then, the worker action failed. It seems the JVM has its own FD limit, which can be changed to follow the system/shell limit via the -XX:-MaxFDLimit argument. Adding this flag to the java_binary targets worked for us.
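
As a rough illustration of the java_binary change described above, a target could pass the flag through the `jvm_flags` attribute. This is a minimal sketch, not the actual rules_kotlin change: the target name, main class, and dependency label are hypothetical placeholders.

```starlark
# BUILD.bazel (sketch; every label and class name here is hypothetical)
java_binary(
    name = "example_builder",
    main_class = "com.example.builder.Main",
    # Per the report above, this flag makes the JVM follow the file
    # descriptor limit inherited from the system/shell (ulimit -n)
    # instead of applying its own limit.
    jvm_flags = ["-XX:-MaxFDLimit"],
    runtime_deps = [":example_builder_lib"],
)
```

Setting the flag on the binary itself scopes the change to the worker built from that target, rather than to every JVM that Bazel launches.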

References:

  • https://wilsonmar.github.io/maximum-limits/
  • https://stackoverflow.com/a/33838568
  • https://github.com/gradle/gradle/issues/17274
  • https://github.com/bazelbuild/bazel/issues/15278

arunkumar9t2 • Aug 06 '22 10:08

@oliviernotteghem is this the same fix that you mentioned using last week?

Bencodes • Aug 07 '22 20:08

Does this work with Java 8? The flags changed, which has been problematic.

restingbull • Aug 07 '22 21:08

Based on a cursory look, it does, according to this gist: https://gist.github.com/ndimiduk/a6c2aa781c20fb8bb9c20abbcf5bac4f

arunkumar9t2 • Aug 07 '22 21:08

@Bencodes: this is actually the approach @nkoroste discussed. At Uber, we worked around the issue by pruning transitive deps, which reduces the number of open files because far fewer jars end up on the classpath. Our initial fix was to limit the number of parallel jobs (it usually takes more than one compilation action to reach 10k+ open files at a given time), which obviously wasn't ideal for build time and parallelism.

oliviernotteghem • Aug 08 '22 22:08

We are also pruning transitive deps in library targets and did not hit the issue there. It happens in the binary target, where transitive deps are unavoidable. More specifically, we get this error when Dagger tries to generate classes in the Kapt action.

The Bazel team prefers adjusting individual worker definitions over a global fix.

arunkumar9t2 • Aug 09 '22 06:08

For more context, there is a similar change in Bazel core: https://github.com/bazelbuild/bazel/commit/30dd8715847e4c797cd28da13e14eab7721b518d

We also needed the following change: https://github.com/bazelbuild/bazel/pull/15978

That allowed us to add the following to .bazelrc to fix "Too many open files" on macOS:

build:macos --host_jvmopt="-XX:-MaxFDLimit"
build:macos --jvmopt="-XX:-MaxFDLimit"

I think the cleanest fix would be to add this as a default in the low-level toolchain and apply it only on macOS/Darwin.

nkoroste • Aug 12 '22 20:08