bazel crashed due to an internal error. Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@WORKSPACE'

Open sgammelmark opened this issue 2 years ago • 11 comments

Description of the bug:

I am setting up a new project to build with bazel, but when I run bazel test //..., I get the error 'java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@WORKSPACE' (requested by nodes '[/Volumes/code/bazel/$$$WORKSPACE_NAME$$$]/[external/WORKSPACE]')', where $$$WORKSPACE_NAME$$$ is the name of the workspace and /Volumes/code is a case-sensitive volume on macOS.

I have been unable to pinpoint the trigger. I can run bazel test //tests/... and bazel test //:all but not bazel test //...

The workspace contains primarily Python code and refers to another local workspace by relative path, using local_repository like this:

local_repository(
    name = "common",
    path = "../path/to/common",
)

We are using this approach during a transitional period. The crash appears to happen when bazel tries to access the 'common' repository.

Reported stack trace was

bazel test //...
Loading: 0 packages loaded
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@WORKSPACE' (requested by nodes '[/Volumes/code/bazel/SECRET]/[external/WORKSPACE]')
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:633)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:365)
	at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.ClassCastException: class com.google.devtools.build.lib.packages.InputFile cannot be cast to class com.google.devtools.build.lib.packages.Rule (com.google.devtools.build.lib.packages.InputFile and com.google.devtools.build.lib.packages.Rule are in unnamed module of loader 'app')
	at com.google.devtools.build.lib.packages.Package.getRule(Package.java:676)
	at com.google.devtools.build.lib.repository.ExternalPackageHelper$ExternalPackageRuleExtractor.processAndShouldContinue(ExternalPackageHelper.java:144)
	at com.google.devtools.build.lib.repository.ExternalPackageHelper.iterateWorkspaceFragments(ExternalPackageHelper.java:118)
	at com.google.devtools.build.lib.repository.ExternalPackageHelper.getRuleByName(ExternalPackageHelper.java:52)
	at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.getRepoRuleFromWorkspace(RepositoryDelegatorFunction.java:444)
	at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:280)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:562)
	... 7 more

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I am unable to extract a minimal example (without also extracting proprietary information). I am open to help construct a minimal example or gather additional logs, if I can reproduce the error consistently.

Which operating system are you running Bazel on?

macOS 13.6 (22G120)

What is the output of bazel info release?

release 6.4.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

sgammelmark, Nov 13 '23 13:11

Thanks for reporting. Is it possible to construct a minimal reproducible case that you can share?

meteorcloudy, Nov 14 '23 15:11

hmm. this might be as simple as your workspace name being the literal string "WORKSPACE". Could you try changing the workspace(name=X) clause in your WORKSPACE file to not say "WORKSPACE"?

Wyverald, Nov 14 '23 19:11

or maybe not the workspace name itself, rather some repo being named "WORKSPACE".

Wyverald, Nov 14 '23 19:11

> Thanks for reporting. Is it possible to construct a minimal reproducible case that you can share?

The workspace is not named 'WORKSPACE' but named after the internal project name.

sgammelmark, Nov 16 '23 12:11

> or maybe not the workspace name itself, rather some repo being named "WORKSPACE".

I cannot find which one it might be - there is no folder in 'external' called WORKSPACE.

@meteorcloudy is there some additional logging I can enable that could help me narrow down what triggers it?

sgammelmark, Nov 16 '23 12:11

I think this is the same as https://github.com/bazelbuild/bazel/issues/14257.

A simple way to reproduce this in any workspace is to add a symlink to the workspace's output base (I found myself doing this out of convenience because I'm in the habit of inspecting stuff in output_base/external to understand what other rules/repos are doing):

ln -s $(bazel info output_base) bazel-output-base

then run any query and it should crash:

❯ bazel query //...
INFO: Invocation ID: e58a0685-cb59-471a-882e-b6d74c9e62f9
Loading: 0 packages loaded
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@WORKSPACE' (requested by nodes '[/private/var/tmp/_bazel_mattbrown/842deef9a683cc84bf620dbdff1ec518]/[external/WORKSPACE]')
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:633)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:365)
        at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.ClassCastException: class com.google.devtools.build.lib.packages.InputFile cannot be cast to class com.google.devtools.build.lib.packages.Rule (com.google.devtools.build.lib.packages.InputFile and com.google.devtools.build.lib.packages.Rule are in unnamed module of loader 'app')
        at com.google.devtools.build.lib.packages.Package.getRule(Package.java:676)
        at com.google.devtools.build.lib.repository.ExternalPackageHelper$ExternalPackageRuleExtractor.processAndShouldContinue(ExternalPackageHelper.java:144)
        at com.google.devtools.build.lib.repository.ExternalPackageHelper.iterateWorkspaceFragments(ExternalPackageHelper.java:118)
        at com.google.devtools.build.lib.repository.ExternalPackageHelper.getRuleByName(ExternalPackageHelper.java:52)
        at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.getRepoRuleFromWorkspace(RepositoryDelegatorFunction.java:444)
        at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:280)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:562)
        ... 7 more

This is easy to work around by adding the symlink to .bazelignore:

$ echo bazel-output-base > .bazelignore

$ bazel query //...
Starting local Bazel server and connecting to it...
INFO: Invocation ID: 0e617eef-e728-4adb-a2f9-8c6b3e2882e8
...
<prints all my packages>

I've reproduced this on Bazel 6.4.0 and 7.0.2.
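
Note that the echo command above uses >, which replaces any existing .bazelignore. The following is a minimal sketch of an append-only variant; add_to_bazelignore is a hypothetical helper name, not something Bazel provides, and it assumes each .bazelignore entry is a plain path on its own line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Bazel): add an entry to .bazelignore
# only if it is not already listed, keeping any existing entries.
add_to_bazelignore() {
  local entry="$1"
  local file="${2:-.bazelignore}"   # defaults to .bazelignore in the current directory
  touch "$file"
  # If the file is non-empty and lacks a trailing newline, add one first
  # so the new entry does not get glued onto the last line.
  if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  # -x matches whole lines, -F treats the entry as a literal string.
  if ! grep -qxF -- "$entry" "$file"; then
    printf '%s\n' "$entry" >> "$file"
  fi
}
```

From the workspace root this would be called as add_to_bazelignore bazel-output-base; running it twice leaves a single entry.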

mattnworb, Feb 28 '24 03:02

@mattnworb Thanks! Can you please try to use bazelisk bisect to identify which commit is the culprit?

meteorcloudy, Feb 28 '24 12:02

I'm not sure I can bisect it, as I can't find a "good" commit to start from. Going all the way back to bazel 1.0.0, this repro case causes the same crash:

❯ cat WORKSPACE
workspace(name = "repro-20172")

❯ cat helloworld/BUILD.bazel
java_binary(
    name = "say-hi",
    srcs = glob(["src/main/java/**/*.java"]),
    main_class = "com.mattnworb.helloworld.cli.Hello",
)


❯ cat helloworld/src/main/java/Hello.java
package com.mattnworb.helloworld.cli;

public class Hello {
  public static void main(String[] args) {
    System.out.println("Hello world!");
  }
}

mattnworb, Feb 28 '24 14:02

I accidentally found a way to reproduce this crash. It is in fact not surprising that something broke, but I think an error message would be nice.

If you create a workspace and build something, and then make a symlink to the external directory under the output base (the path printed by bazel info output_base) without putting it in .bazelignore, bazel will crash (just tested on 7.1.2).

In other words, go into a workspace with an existing build and run

ln -s `bazel info output_base`/external bazel-external

without adding 'bazel-external' to .bazelignore. Then run

bazel build //...

and bazel will crash with the above message.
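
To check whether a workspace contains such a link, something like the following sketch may help. It is only a diagnostic built on assumptions: find_unignored_links is a hypothetical name, the output base path is passed in by the caller (for example as printed by bazel info output_base), and .bazelignore entries are assumed to be plain names on their own lines. It may also list Bazel's own convenience symlinks such as bazel-bin, which this thread does not implicate.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic (not a Bazel feature): print top-level symlinks in a
# workspace that resolve into a given directory and are not in .bazelignore.
find_unignored_links() {
  local workspace="$1" base link name target
  # Resolve physically so aliases like /var and /private/var compare equal.
  base="$(CDPATH= cd -- "$2" && pwd -P)" || return 1
  for link in "$workspace"/*; do
    [ -L "$link" ] || continue
    name="$(basename "$link")"
    # Skip links that are broken or do not point at a directory.
    target="$(CDPATH= cd -- "$link" 2>/dev/null && pwd -P)" || continue
    case "$target" in
      "$base"|"$base"/*) ;;   # resolves into the given directory
      *) continue ;;
    esac
    if ! grep -qxF -- "$name" "$workspace/.bazelignore" 2>/dev/null; then
      printf '%s\n' "$name"
    fi
  done
}
```

Called as find_unignored_links . /path/to/output_base from the workspace root, it prints one link name per line and prints nothing when every such link is already listed.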

sgammelmark, May 29 '24 09:05

Thanks for the repro! This is definitely not a supported use case, but Bazel should fail gracefully.

meteorcloudy, May 29 '24 09:05

We don't have capacity to take this on immediately, but a PR to fix it would be very welcome!

meteorcloudy, May 29 '24 09:05

I ran into the same issue after playing around with bazel vendor (i.e. not having it enabled anymore). Removing the bazel-external symlink worked for me. It was already .gitignored (as /bazel-*).

lalten, Sep 17 '24 12:09