intellij Proper resolution / completion of external workspaces / repositories in label definitions

Description of the feature request:

As of current master, when looking at a label definition in a BUILD.xxx file (and especially inside bzlmod enabled workspaces) I would like to be able to complete/resolve the visible repository / external workspaces. This doesn't really work as of now

Which category does this issue belong to?

Intellij

What underlying problem are you trying to solve with this feature?

No response

What operating system, Intellij IDE and programming languages are you using? Please provide specific versions.

Current plugin HEAD on mac

Have you found anything relevant by searching the web?

There are some issues related to more general bzl mod support but they seem to have been pushed in the future since internally google didn't adopt this at scale.

#6590 is another recent instance of this.

Any other information, logs, or outputs that you want to share?

I'm open to work on some of these if acceptable.

Aug 26 '24 20:08 mtoader

To add more context on my thinking here:

I want to call into bazel/blaze mod dump_repo_mapping on sync, store that in a similar way as one of the BlazeProjectData parts and use it inside an "fixed" ExternalWorkspaceFragment reference to provide ergonomic completion and resolution into <executionRoot>/external/workspace name

One open question I have is: should I try to do this using the async mode or the aspect mode?

Aug 26 '24 20:08 mtoader

Thanks @mtoader, I think we can accept contributions addressing this issue cc @tpasternak can you clarify what do you mean by async mode vs aspect mode?

Aug 27 '24 01:08 mai93

https://github.com/bazelbuild/intellij/blob/2020a233b899b63ba1c0a91bd96fe0e54d6ce606/base/src/com/google/idea/blaze/base/model/BlazeProjectData.java#L29 is kind of basis of a lot of resolving in the current plugin. It has two implementations:

https://github.com/bazelbuild/intellij/blob/2020a233b899b63ba1c0a91bd96fe0e54d6ce606/base/src/com/google/idea/blaze/base/model/AspectSyncProjectData.java#L48 which is built using bazel aspects by running build that will output both the results and the ide data needed to build the project model
https://github.com/bazelbuild/intellij/blob/2020a233b899b63ba1c0a91bd96fe0e54d6ce606/base/src/com/google/idea/blaze/base/qsync/QuerySyncProjectData.java#L37 which seems to try to do the same thing by using mostly bazel query info

However, the latter is off (the experiment flag is false as the default), as it only seems enabled within Google. It is also incomplete in the current master (see below), which makes me vary about implementing things within it.

@Override
  public TargetMap getTargetMap() {
    throw new NotSupportedWithQuerySyncException("getTargetMap");
  }

  @Override
  public BlazeInfo getBlazeInfo() {
    throw new NotSupportedWithQuerySyncException("getBlazeInfo");
  }
 // ...
  @Override
  public ArtifactLocationDecoder getArtifactLocationDecoder() {
    throw new NotSupportedWithQuerySyncException("getArtifactLocationDecoder");
  }

  @Override
  public RemoteOutputArtifacts getRemoteOutputs() {
    throw new NotSupportedWithQuerySyncException("getRemoteOutputs");
  }

  @Override
  public ExternalWorkspaceData getExternalWorkspaceData() {
    throw new NotSupportedWithQuerySyncException("getExternalWorkspaceData");
  }

  @Override
  public SyncState getSyncState() {
    throw new NotSupportedWithQuerySyncException("getSyncState");
  }

I'm not sure what is the rollout/completion plan of said functionality since if I want to plumb some metadata into BlazeProjectData (as I would have to) it would imply touching this too in some way.

Aug 27 '24 01:08 mtoader

So the qsync module has been introduced some time ago by Google, however at the moment it works well only with Android Studio, as we didn't have capacity to provide the proper support for it. This means, you can skip the qsync part for now

Aug 27 '24 10:08 tpasternak

Also, the issue occurs for bzlmod imports. Legacy imports still work well. This comes from the fact that bzlmod stores external repositories in directories, which names are not same as the repositories.

It just fails here, because it merges "external" stem with the repository name, resulting in a directory that doesn't exist https://github.com/bazelbuild/intellij/blob/b67720a686c32e98d6903bd8a49ac6ccaf9282c9/base/src/com/google/idea/blaze/base/sync/workspace/WorkspaceHelper.java#L213-L219

Aug 27 '24 10:08 tpasternak

When i checked that didn't do completion at all (neither in classical or bzlmod situatiosn). I recall solving completion when i was within Goog but i don't know what happened with that.

However in order to do it right in bzlmod world we need to use the output of bazel mod dump_repo_mapping workspace (which is what i'm gonna do).

Aug 27 '24 15:08 mtoader

oh, wait, you're right, I meant navigation, not completion

Aug 27 '24 16:08 tpasternak

As of now i have this big change here: https://github.com/bazelbuild/intellij/compare/master...mtoader:intellij:mtoader/support-module-based-repo-completion. It is still incomplete since I have to fix resolving and tweak the completion UX (fix the tab vs enter vs inside/outside quotes, etc behavior).

My q is: would the reviewers be ok with such a big change or should I make into more manageable pieces?

Note: this is fundamentally big because the ceremony is needed to hook into the sync machinery.

Aug 27 '24 17:08 mtoader

Managable pieces highly appreciated! Thank you for asking btw

Aug 27 '24 19:08 tpasternak

Posted the first PR in the series

Aug 28 '24 00:08 mtoader

Just loose thoughts for later:

we will for sure need a registry flag. I'm almost sure there will be some use case where it just crashes and we will need to provide immediate troubleshooting
Probably we need to handle bazel 5 (still supported)

Aug 29 '24 20:08 tpasternak

So far, I'm making this work only if bazel has mod dump_repo_mapping (aka bazel > 7.1.0). For the others, it will just behave as before.

Aug 29 '24 20:08 mtoader

I know we only have to ensure it doesn't crash

Aug 29 '24 20:08 tpasternak

The plugin? I'm trying to default the ExternalWorkspaceData to EMPTY in all computations. And that should only change if i can infer things from within bazel mod ... (which I gate with bazel version for now).

I could make that execution success optional and warn if it fails. I will see how hard that is.

If your bazel is > 7.1.0 and the blaze mod ... invocation fails, it will break sync. I had to add some workarounds, for example, in PR no: 3, to not have it crash for Clion (where there is a cpp-specific BlazeFlagsProvider that would add a flag unknown to mod).

Aug 29 '24 20:08 mtoader

It would be enough to just check the bazel version and skip the call if it's too low.

Aug 29 '24 20:08 tpasternak

It would be enough to just check the bazel version and skip the call if it's too low.

The last PR actually does that

Aug 29 '24 20:08 mtoader

Posted https://github.com/bazelbuild/intellij/pull/6700 to enabled resolving inside external repositories.

Still working on nice completions (for some bg: I kind of don't like that LabelReference spans the whole StringLiteral since I can resolve parts of the label independently but changing that breaks a whole set of tests and I'm not sure I want to open that can of worms for now).

Aug 30 '24 19:08 mtoader

@tpasternak I have a question. Now that the registry key is defaulted off, how will we present this change to the users (for validation / etc)? How are these changes mostly rolled out?

Sep 03 '24 17:09 mtoader

Please set it to true. I only need this flag, because I'm worried that there are some uncovered, unexpected scenarios and I want to have a quickfix already prepared :)

Sep 03 '24 17:09 tpasternak

I set the default to false because at the time the "enable_bzlmod=true" case was unhandled

Sep 03 '24 17:09 tpasternak

Set it to true in: https://github.com/bazelbuild/intellij/pull/6700

Sep 03 '24 17:09 mtoader

Ok, this part is also a blocker, I'll take care of that (the *~) ignore pattern

https://sourcegraph.com/github.com/JetBrains/intellij-community@5ccbe404f048f29a435abd48d09070cf86cb1a94/-/blob/platform/platform-impl/src/com/intellij/openapi/fileTypes/impl/FileTypeManagerImpl.java?L80-82

Sep 04 '24 10:09 tpasternak

Ok, I don't think there's any point in modifying IJ for this reason, as Bazel is not going to use the ~ separator since 8.0.0. IMO the best way would be to recommend users to add --incompatible_use_plus_in_repo_names in their bazelrc. It can be checked with starlark-sematics call.

We can also add a tooltip asking users to do it

Sep 04 '24 11:09 tpasternak

@mtoader do you think we need any improvements to the bazel mod dump_repo_mapping? @fmeum offered help if we need any

Sep 04 '24 12:09 tpasternak

@mtoader do you think we need any improvements to the bazel mod dump_repo_mapping? @fmeum offered help if we need any

There is a bit needed in order to make this work recursively: e.g. when trying to resolve an aliased target inside a module. You need to run the same command but with the mod name and you will get a different map. It would be nice jf you could get that info in one go at the start

Sep 04 '24 14:09 mtoader

Ok, I don't think there's any point in modifying IJ for this reason, as Bazel is not going to use the ~ separator since 8.0.0. IMO the best way would be to recommend users to add --incompatible_use_plus_in_repo_names in their bazelrc. It can be checked with starlark-sematics call.

We can also add a tooltip asking users to do it

Oh. I have that on in my test repo. We should add a warning if it is more than a stylistical change.

Sep 04 '24 14:09 mtoader

@mtoader do you think we need any improvements to the bazel mod dump_repo_mapping? @fmeum offered help if we need any

There is a bit needed in order to make this work recursively: e.g. when trying to resolve an aliased target inside a module. You need to run the same command but with the mod name and you will get a different map. It would be nice jf you could get that info in one go at the start

I took a look at the current usage of bazel mod dump_repo_mapping and it looks like you are invoking it with the fixed argument workspace. I'm surprised that this works for you as the arguments should be canonical repository names and the canonical name of the main repository is the empty string, not "workspace".

I see a number of ways to get the names for all relevant external repos:

Invoke bazel mod dump_repo_mapping multiple times: First, with the empty string as the single arg. Then repeatedly for each canonical repository name (aka map value) in the result until all repo names have been collected.
Add a flag to dump_repo_mapping that does this in a single invocation.
Have an argless invocation of dump_repo_mapping return all repository mappings in the Skyframe cache, similar to how bazel config does this. This can be problematic if something else invalidates Skyframe nodes between invocations, for example when the previous invocation updated the lockfile.

Could you maybe try to use the first approach? Even if that doesn't work out well, we may have a better understanding of what we would want Bazel to provide.

Sep 04 '24 14:09 fmeum

@mtoader do you think we need any improvements to the bazel mod dump_repo_mapping? @fmeum offered help if we need any

There is a bit needed in order to make this work recursively: e.g. when trying to resolve an aliased target inside a module. You need to run the same command but with the mod name and you will get a different map. It would be nice jf you could get that info in one go at the start

I took a look at the current usage of bazel mod dump_repo_mapping and it looks like you are invoking it with the fixed argument workspace. I'm surprised that this works for you as the arguments should be canonical repository names and the canonical name of the main repository is the empty string, not "workspace".

I see a number of ways to get the names for all relevant external repos:

Invoke bazel mod dump_repo_mapping multiple times: First, with the empty string as the single arg. Then repeatedly for each canonical repository name (aka map value) in the result until all repo names have been collected.

Add a flag to dump_repo_mapping that does this in a single invocation.

Have an argless invocation of dump_repo_mapping return all repository mappings in the Skyframe cache, similar to how bazel config does this. This can be problematic if something else invalidates Skyframe nodes between invocations, for example when the previous invocation updated the lockfile.

Could you maybe try to use the first approach? Even if that doesn't work out well, we may have a better understanding of what we would want Bazel to provide.

Initially, I tried it with the default root name: bazel mod dump_repo_mapping "". Unfortunately this gave me a lot more names to use (a lot of which i assume are the toolchain repos) those extra repos look like:

emotejdk17_linux_s390x_toolchain_config_repo: remotejdk17_linux_s390x_toolchain_config_repo
remotejdk21_macos_aarch64_toolchain_config_repo: remotejdk21_macos_aarch64_toolchain_config_repo
remotejdk17_linux_aarch64_toolchain_config_repo: remotejdk17_linux_aarch64_toolchain_config_repo
remotejdk17_linux_toolchain_config_repo: remotejdk17_linux_toolchain_config_repo
remote_java_tools_windows: remote_java_tools_windows
remotejdk11_win_toolchain_config_repo: remotejdk11_win_toolchain_config_repo
remotejdk21_linux_ppc64le: remotejdk21_linux_ppc64le
remotejdk11_linux_aarch64: remotejdk11_linux_aarch64
remotejdk17_linux: remotejdk17_linux
remotejdk11_linux_s390x_toolchain_config_repo: remotejdk11_linux_s390x_toolchain_config_repo
remotejdk17_macos: remotejdk17_macos
remotejdk21_macos_toolchain_config_repo: remotejdk21_macos_toolchain_config_repo

I assumed workspace worked better since bazel doc refers to @// as being the same as @workspace// and somehow using that name would work better. It does in the sense that i only get a list of "visible" repositories.

Invoking bazel mod recursively is not something i think we should do (hence why i didn't do it for now). The plan was to run bazel mod dump_repo_mapping "external_repo_canonical_name" only when i need to resolve a target inside a file from that workspace. Reasoning being: one needs this rarely since you mainly want to make changes while in your principal repository so a latency pain would be acceptable there.

But the ideal state of the world would be having two flags:

mark_implicit_deps (which would mark those remote.... as being implicit deps (or support --[no]implicit_deps as other commands)
recurse_modules: which should return a richer map starting from a root repo and recursively dump the maps available for all dependencies.

Sep 04 '24 16:09 mtoader

Ok, I don't think there's any point in modifying IJ for this reason, as Bazel is not going to use the ~ separator since 8.0.0. IMO the best way would be to recommend users to add --incompatible_use_plus_in_repo_names in their bazelrc. It can be checked with starlark-sematics call.

We can also add a tooltip asking users to do it

What would be the preferable way of notifying the users about this? I know IJ has a couple of ways to do notifications but I'm not sure what this plugin prefers. Can you ideally point to a sample?

My fear is: Since the default is to use ~ and since we need it to be + in order for this plugin to work (without changing the IDE SDK) not a lot of users will have that flag flipped and they will miss out on the bits I'm adding. I would like to at least have a reliable way of letting them know.

Sep 04 '24 18:09 mtoader

workspace works just as any other non-existent repo name: it will result in an error with --noenable_workspace (default in Bazel 8) and return a fallback only relevant with WORKSPACE otherwise. Please test with the flag and you should also see these "implicit" repos go away.

I assumed that running Bazel on demand as an external repo is accessed would be at odds with the plugins sync approach, but if it works well enough, I would recommend that approach. But if that doesn't work out, we can certainly add the recursive lookup to Bazel.

Sep 04 '24 22:09 fmeum