
Repositories order and performance regression in 6.8

Open al-babych-fivetran opened this issue 3 months ago • 2 comments

Version 6.8 introduced an issue with the order of repositories during lock file generation. I created a small example repo: https://github.com/al-babych-fivetran/rules_jvm_external_issue/commits/main/.

It has two commits, one using 6.7 and one using 6.8.

The maven.install configuration has several repositories:

repositories = [
    "https://repo1.maven.org/maven2/",
    "https://aaaa.com/maven/",
    "https://cccc.com/maven/",
    "https://bbbb.com/maven/",
    "https://dddd.com/maven/",
],

The lock file was generated by the bazel run @maven//:pin command. In version 6.7, the order of repositories in the lock file is exactly the same as in the repositories option:

[Screenshot: repositories in the lock file generated with 6.7, in the configured order]

But in version 6.8, they are sorted alphabetically:

[Screenshot: repositories in the lock file generated with 6.8, in alphabetical order]
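The reordering can also be checked without comparing screenshots. Below is a minimal Python sketch, assuming the lock file (`maven_install.json`) has a top-level `"repositories"` object keyed by repository URL, and that the key order is the order shown in the screenshots. The function names are hypothetical helpers, not part of rules_jvm_external.

```python
import json


def lock_file_repositories(lock_text):
    """Return repository URLs in the order they appear in the lock file."""
    data = json.loads(lock_text)
    # Assumption: "repositories" is a JSON object keyed by repository URL.
    # json.loads preserves the key order found in the file.
    return list(data.get("repositories", {}))


def order_matches(configured, locked):
    """True if the URLs common to both lists appear in the same relative order."""
    wanted = [url for url in configured if url in locked]
    present = [url for url in locked if url in configured]
    return wanted == present
```

Passing the `repositories` list from `maven.install` as `configured` and the contents of the generated lock file to `lock_file_repositories` should return `True` from `order_matches` for the 6.7 commit and `False` for the 6.8 commit, if the assumption about the file layout holds.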

The build tries to download each package from the repositories in lock file order, so with alphabetical sorting Maven Central is tried last. For this tiny repo, a build with version 6.7 takes ~43s:

INFO: Found 1 target...
Target //:example up-to-date:
  bazel-bin/example
  bazel-bin/example.jar
INFO: Elapsed time: 43.344s, Critical Path: 31.71s
INFO: 1252 processes: 9 internal, 1239 darwin-sandbox, 4 worker.
INFO: Build completed successfully, 1252 total actions

The build with version 6.8 emits a lot of warnings and takes ~127s:

...
WARNING: Download from https://bbbb.com/maven/com/amazonaws/aws-java-sdk-accessanalyzer/1.12.732/aws-java-sdk-accessanalyzer-1.12.732-sources.jar failed: class java.io.FileNotFoundException GET returned 404 Not Found
WARNING: Download from https://dddd.com/maven/com/amazonaws/aws-java-sdk-appstream/1.12.732/aws-java-sdk-appstream-1.12.732.jar failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException Unknown host: dddd.com
WARNING: Download from https://dddd.com/maven/com/amazonaws/aws-java-sdk-signer/1.12.732/aws-java-sdk-signer-1.12.732-sources.jar failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException Unknown host: dddd.com
...
INFO: Analyzed target //:example (910 packages loaded, 7755 targets configured).
INFO: Found 1 target...
Target //:example up-to-date:
  bazel-bin/example
  bazel-bin/example.jar
INFO: Elapsed time: 127.756s, Critical Path: 30.43s
INFO: 1252 processes: 420 internal, 828 darwin-sandbox, 4 worker.
INFO: Build completed successfully, 1252 total actions
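To see which repositories account for the wasted attempts, the warnings can be tallied per host. This is a rough Python sketch that assumes the warning lines look exactly like the ones in the log above (`WARNING: Download from <url> failed: ...`); `failed_downloads_by_host` is a hypothetical helper name.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches the failed-download warnings shown in the build log above.
FAILED_DOWNLOAD = re.compile(r"^WARNING: Download from (\S+) failed:")


def failed_downloads_by_host(log_lines):
    """Count failed download warnings per repository host."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_DOWNLOAD.match(line)
        if match:
            # Group by host name so each repository gets one total.
            counts[urlparse(match.group(1)).netloc] += 1
    return counts
```

Bazel prints these warnings to stderr, so the build output has to be captured with stderr included before its lines are passed to the function.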

Our production repo has 600+ dependencies and five repositories; most packages come from Maven Central and only a couple from the other repositories. There, the difference is 10 minutes vs. 1 hour 10 minutes, which makes 6.8 unusable for us.

al-babych-fivetran, Oct 14 '25 07:10

Can you please check this with the current HEAD of master? I believe this has already been fixed there.

shs96c, Oct 14 '25 08:10

I tried with this override (the latest commit at the moment):

archive_override(
    module_name = "rules_jvm_external",
    integrity = "sha256-5gmTZetN4JTmpbPU9iunxfMewTnOJM/iNFnNClLvOF0=",
    strip_prefix = "rules_jvm_external-cc6d5b6077f67b057cb6157ea1fcaec3fadc1ad6",
    urls = [
        "https://github.com/bazel-contrib/rules_jvm_external/archive/cc6d5b6077f67b057cb6157ea1fcaec3fadc1ad6.tar.gz",
    ],
)

and still see the alphabetical order in the lock file:

[Screenshot: repositories in the lock file, still in alphabetical order]

al-babych-fivetran, Oct 14 '25 09:10