rules_jvm_external
Maven install with lock manifest is making inefficient use of caches
I've been struggling for a while with the _extension targets we create. Our setup uses open-source BuildBuddy over grpcs for caching and BES. Our repository has 3 maven_install trees of at least 300 coordinates each, and on top of that we have a few Maven dependencies that contain resources and are really large; some of the ones used for testing are literally 250MB.
Whenever the network connection to the remote cache is flaky, we end up spending a lot of time uploading these _extension targets to the cache or downloading them from it.
After looking at the source code I noticed two problems, which seem to have gone unnoticed for a long time: git blame shows this code was refactored from coursier over 5 years ago.
The first problem is the use of genrule, which doesn't provide a mnemonic like other actions and doesn't allow overriding its execution requirements to disable remote caching or remote execution. With a custom rule that sets a mnemonic, consumers of rules_jvm_external could disable remote caching for it, since the operation is very IO-bound and very cheap to do locally.
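To make the idea concrete, here is a minimal Starlark sketch of a rule that performs the copy under its own mnemonic. The rule name `jvm_artifact_copy`, the mnemonic `JvmArtifactCopy`, and the attribute names are all hypothetical; nothing like this exists in rules_jvm_external today, and the hard-coded execution requirement is only one possible default.

```starlark
# Sketch only: a hypothetical private rule replacing the genrule-based copy.
def _jvm_artifact_copy_impl(ctx):
    src = ctx.file.src
    out = ctx.actions.declare_file(ctx.attr.out)

    ctx.actions.run_shell(
        inputs = [src],
        outputs = [out],
        # $1 and $2 are filled in from `arguments` below.
        command = 'cp "$1" "$2"',
        arguments = [src.path, out.path],
        # Hypothetical mnemonic; it lets consumers target this action by name.
        mnemonic = "JvmArtifactCopy",
        progress_message = "Copying %s" % src.short_path,
        # Assumption: skipping the remote cache is the desired default here.
        execution_requirements = {"no-remote-cache": "1"},
    )
    return [DefaultInfo(files = depset([out]))]

jvm_artifact_copy = rule(
    implementation = _jvm_artifact_copy_impl,
    attrs = {
        "src": attr.label(allow_single_file = True, mandatory = True),
        "out": attr.string(mandatory = True),
    },
)
```

With a distinct mnemonic, a consumer who disagrees with the default could presumably override it from their .bazelrc, for example `--modify_execution_info=JvmArtifactCopy=-no-remote-cache`, without patching the rule.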
The second problem is that it uses cp: we literally copy the http_file into the @maven repository, when we could just create a symlink where the OS allows it.
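A symlink-based variant could look roughly like the sketch below. It relies on `ctx.actions.symlink`, which is part of Bazel's rule API; the rule name `jvm_artifact_link` and its attributes are hypothetical, and how this behaves on platforms without symlink support is not addressed here.

```starlark
# Sketch only: a hypothetical rule that links the downloaded file
# instead of copying it.
def _jvm_artifact_link_impl(ctx):
    out = ctx.actions.declare_file(ctx.attr.out)

    # Declares `out` as a symlink pointing at the already-downloaded file,
    # so no file contents are duplicated on disk.
    ctx.actions.symlink(
        output = out,
        target_file = ctx.file.src,
        progress_message = "Linking %s" % ctx.file.src.short_path,
    )
    return [DefaultInfo(files = depset([out]))]

jvm_artifact_link = rule(
    implementation = _jvm_artifact_link_impl,
    attrs = {
        "src": attr.label(allow_single_file = True, mandatory = True),
        "out": attr.string(mandatory = True),
    },
)
```

Whether downstream rules that consume the jar are happy with a symlink rather than a regular file is something that would need checking before adopting this.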
All of this makes me think that adding a custom private rule for this, or depending on something like copy_file from Aspect's bazel-lib, would make things better and hopefully even speed up CI on large repositories, as it would reduce the IO load.
I'm going to hack something together soon, but I wanted to open the discussion first; most likely I'm missing something here.