nowinandroid icon indicating copy to clipboard operation
nowinandroid copied to clipboard

Build tuning

Open kaeawc opened this issue 1 year ago • 2 comments

What I have done and why

It was pointed out in a discussion that NowInAndroid uses a 6GB heap size which seemed high. I did a couple perf runs with current settings and no build cache or config cache. I did this at a few heap sizes like 12G, 6G, 4G, and 3G. Only at 3G do we observe a regression in build speed which is pretty easily explained by the 27 seconds spent in garbage collection - the build process was encountering GC thrashing in attempting to complete the build.

I then changed the JVM arg settings to be what I would use for tuning and while the first build at 3G was only slightly faster the subsequent ones are proving to be just as fast as all the other higher heap size ones.

Settings changed so far is just to this:

org.gradle.jvmargs=-Dfile.encoding=UTF-8 -XX:+UseG1GC -XX:SoftRefLRUPolicyMSPerMB=1 -XX:ReservedCodeCacheSize=512m -XX:+HeapDumpOnOutOfMemoryError -XX:+UnlockExperimentalVMOptions -Xmx3g -Xms3g -XX:MetaspaceSize=1g
kotlin.daemon.jvmargs=<same as Gradle>

In this PR I also want to add jemalloc to the GitHub CI as that should give roughly 10% memory savings by fixing native memory fragmentation. I did this through a custom action that caches jemalloc.

I also tried out other heap sizes with the new configuration. Anything higher than 3G didn't make a performance difference and anything lower (2G) ran the risk of the build not completing.

I'm happy to alter anything about this contribution, the goal is to help show how its possible to reduce resource usage while increasing performance.

kaeawc avatar Jun 06 '24 04:06 kaeawc

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-cla[bot] avatar Jun 06 '24 04:06 google-cla[bot]

Could you split this PR into the gradle.properties updates and the jemalloc updates separately?

The gradle.properties updates look good to me and would approve, but would like to discuss the jemalloc change in more detail.

alexvanyo avatar Jun 20 '24 19:06 alexvanyo

@alexvanyo sorry I didn't see your message, happy to split up the PR. Do you also want a separate issue?

kaeawc avatar Sep 15 '24 09:09 kaeawc

I've removed jemalloc from this PR.

kaeawc avatar Sep 15 '24 10:09 kaeawc

@kaeawc could you explain the reason for each new flag? I'm not sure they are doing anything worth including in this project. Also, since this project is meant as a showcase, many devs will likely copy/paste the code, it would be best to not add "magic" and unexplained JVM tuning. 🙏

SimonMarquis avatar Sep 15 '24 15:09 SimonMarquis

@SimonMarquis of course!

Metaspace, CodeCache, and Heap are all separate memory stores in the JVM. This goes all the way back to JDK 8.

Regarding the removal of MaxMetaspaceSize and addition of MetaspaceSize as 256MB - I outlined my research on the topic here for why Gradle projects that target JDK versions newer than 17 should try removing it. Setting a maximum without understanding of what the maximum limits in a project others will copy is arguably worse than allowing users to run into classloader memory leaks that have mostly been fixed.

CodeCache has a similar story that I'm also putting together my research for. The default for a 64bit system of 48MB is small relative to the 64-96MB that starter Android projects could take up and benefit from performance from. CodeCache is meant to have a relatively small limit.

Lastly SoftRefLRUPolicyMSPerMB is a pretty straightforward option - based on whether the JVM is configured to run as client or server mode it will release soft references in a number of milliseconds per megabyte of current heap or maximum heap sizes. The default value is 1000, which means for projects with 4GB heap size the JVM will release soft references after 68 minutes (~1 hour). 8GB heaps would release it after 2 hours. Hopefully everyone's builds are faster than that, which means their builds never release any soft references. I have consistently seen setting this to 1 instead of the default of 1000 is a win for releasing memory which can then be used for Metaspace, CodeCache, or the Heap.

kaeawc avatar Sep 17 '24 01:09 kaeawc

I'm gonna pause on merging this because the JVM args for Kotlin Daemon are not behaving as I expected, see https://youtrack.jetbrains.com/issue/KT-71564/Kotlin-Daemon-JVM-args-behavior-does-not-match-documentation

kaeawc avatar Sep 18 '24 12:09 kaeawc

So after a lot of digging I found all the JVM inheritance and defaults for Kotlin daemon and have a fix up for Kotlin's documentation on the subject https://github.com/JetBrains/kotlin-web-site/pull/4454. Knowing that we should definitely also set the same args in both as that was the original intention, however specifically for CacheCode and Metaspace:

  • -XX:MetaspaceSize Sets the size of the allocated class metadata space that will trigger a garbage collection the first time it is exceeded. This threshold for a garbage collection is increased or decreased depending on the amount of metadata used. The default size depends on the platform. Therefore we should just not set this value as setting it delays GC events on Metaspace which is a tiny performance win but making a very strange change to the JVM's behavior.
  • -XX:ReservedCodeCacheSize is already set by the Kotlin daemon and their setting is 320mb. I was previously under the impression that neither Gradle not JetBrains were setting defaults for the code cache, it would be confusing if we set one for an open source project like that that others copy and make it different. Unfortunately there is no way today to set the Gradle JVM arg and still using Kotlin Gradle Plugin's default. Therefore Gradle should be set to 256 and Kotlin to 320.

Those changes are now in this PR

kaeawc avatar Sep 21 '24 21:09 kaeawc

@kaeawc Did you actually measure performance of G1 vs ParallelGC? Because you replaced UseParallelGC with the default GC (which should be obsolete). Previously benchmarks showed that ParallelGC actually performed best, at least for debug builds: https://dev.to/cdsap/jdk-17-using-parellgc-for-the-kotlin-process-in-android-builds-24af

G00fY2 avatar Sep 27 '24 21:09 G00fY2

Yes I have, both in this and other projects. It does result in a faster build time, but at the cost of larger memory footprint. The difference in speed is not very large in the latest JVMs and in a project like this that will be copied for its settings the community should not blindly follow ParallelGC when most projects will not have the resources to spend on huge CI runners and the most expensive developer machines.

kaeawc avatar Sep 27 '24 23:09 kaeawc

G1GC is not obsolete. It is the default for a reason. Parallel used to be the default in JDK8 and prior versions, G1GC serves the widest number of applications the best, especially in the latest JDK versions.

kaeawc avatar Sep 27 '24 23:09 kaeawc

@kaeawc Hmm, your results about GC seem not to match with the benchmarks I found so far. In addition to the previously posted benchmark which uses NiA this one tested 3 different projects: https://bitrise.io/blog/post/switch-to-jdk-17-parallel-gc-and-speed-up-your-android-builds-by-9-20

when most projects will not have the resources to spend on huge CI runners and the most expensive developer machines

That's based on assumptions.

G1GC is not obsolete

I didn't meant G1GC itself is obsolete but the jvmArg -XX:+UseG1GC since it's default since Java 11. I would not expect adding arguments which match to the default.

Also another aspect: Google itself also recommends testing ParallelGC. https://developer.android.com/build/optimize-your-build#experiment-with-the-jvm-parallel-garbage-collector It could be confusing to find different GC without any explanation in the project.

Don't get me wrong. I really like this PR and thanks again for all the investigation and explanations in your blog posts. I am really into this whole build optimization topic. Just the part about the GC felt short from my opinion.

Therefore I would suggest an follow up PR which either removes -XX:+UseG1GC or reverts to -XX:+UseParallelGC. Or we would maybe need some additional comments in the gradle.properties why choosing G1GC over ParallelGC even though there are benchmarks that suggest otherwise.

G00fY2 avatar Sep 28 '24 07:09 G00fY2

@G00fY2 I hear you on using something that is the JVM default. I think let's try to establish the goals more clearly and we can figure out which path is best to take. If we do remove it from the args I'd still like to call out in a comment that we're using G1GC.

@alexvanyo @dturner what are your goals be for NowInAndroid? Would you rather have the project be efficient on resources or speed, even if that speed is actually the same? I can redo the perf tests on this size project but also want to do them on a few other sizes to show what these settings look like when scaling over time.

kaeawc avatar Sep 28 '24 09:09 kaeawc

@kaeawc Our goals for the NowInAndroid build are in order of priority:

  1. reliability
  2. simplicity of maintenance / easy to understand for developers
  3. speed
  4. resource usage

dturner avatar Sep 29 '24 21:09 dturner

@dturner in that case, considering @G00fY2's comment:

  1. Reliability - G1 is more likely to be reliable. As a codebase grows you'll hit heap size limitations faster with Parallel which will break a build.
  2. Simplicity - I can either see removing G1GC from the list of options just so its shorter, but then as a developer you need to understand what the JVM defaults are for the JDK you're running your project against. There are still a bunch of developers on old Gradle (7.x) and old JDK versions. So if we remove it I think keeping it as a comment is worthwhile.
  3. Speed - Profiled it again and the speed difference is not statistically significant for cached builds, but is significant (1 second) for clean builds in NowInAndroid. While Parallel is proven to be the faster option and recommended by Google, the docs there refer to this as "Experiment with the JVM parallel garbage collector" and "potentially improve build performance". I have seen cases where Parallel is not the best GC for speed and the cause is usually due to a lack of JVM tuning experience in a fast growing project. I think it is worth citing that link and noting that improved speed is possible.
  4. Resource usage - G1 is better for this, but its at the bottom of the list.

Based on all that I think this project should keep G1GC arg even though its a default, comment about Google's recommendation for using Parallel. If y'all want to discuss it more lets keep that going. If everyone thinks that's a decent path forward give a 👍🏻 and I'll make a PR with those changes.

kaeawc avatar Sep 30 '24 19:09 kaeawc

@kaeawc I made these changes in one of my projects and now Android Studio spins up two Gradle daemons (filed an issue). Have you been seeing anything like that?

My project uses Gradle 8.11.1, Kotlin 2.0.21, and AGP 8.7.2

eygraber avatar Nov 27 '24 09:11 eygraber

@eygraber just FYI I responded to the issue you filed - I haven't been able to reproduce. If you have a reproducible project you can share I'd be happy to help investigate further.

kaeawc avatar Nov 30 '24 13:11 kaeawc

Thanks, I saw that. Trying to find some time to put a repro together.

eygraber avatar Nov 30 '24 23:11 eygraber