vscode-mlir icon indicating copy to clipboard operation
vscode-mlir copied to clipboard

Go to Definition in #included TableGen files

Open lenary opened this issue 2 years ago • 2 comments

I am not sure if this is a bug or not, or whether I'm using the tablegen language server incorrectly.

I modified the LLVM CMake to generate an entry in tablegen_compile_commands.yml for every single tablegen'd file. This contains duplicates (as the one on main does if you build and include MLIR in the enabled projects), but all duplicates have the same include list.

I'm working on the AArch64 backend (in llvm-project/llvm/lib/Target/AArch64), and I don't think that "Go To Definition" works in some cases.

For instance:

  • In AArch64.td, trying "Go To Definition" on SubtargetFeature<... correctly takes me to the definition of SubtargetFeature in Target.td.
  • In AArch64InstrInfo.td, trying "Go To Definition" on anything doesn't work (I also get a problem on the first definition of not being able to find the class). The same applies in e.g. AArch64SystemOperands.td There are not filepath: entries for these files, because they are not passed to llvm-tblgen directly.

Is this because the extension doesn't cope well with how LLVM normally structures the tablegen in backends? For instance, in the AArch64 backend, AArch64.td contains a few definitions, and then a lot of #include for the other AArch64*.td files, and it is only AArch64.td which is run through llvm-tblgen. Some of the definitions in the AArch64*.td files rely on definitions in AArch64.td, but do not #include that file because the inclusions go in the other direction.

Presumably there are methods to solve this, as C/C++ plugins do cope with definitions/declarations in header files (which do not normally show up in a compile_commands.json), but I don't know how complex that would be to implement.

lenary avatar Aug 05 '22 15:08 lenary

Is this because the extension doesn't cope well with how LLVM normally structures the tablegen in backends? For instance, in the AArch64 backend, AArch64.td contains a few definitions, and then a lot of #include for the other AArch64*.td files, and it is only AArch64.td which is run through llvm-tblgen. Some of the definitions in the AArch64*.td files rely on definitions in AArch64.td, but do not #include that file because the inclusions go in the other direction.

To clarify:

AArch64InstrFormats.td does not compile when run alone, as it is expected to be included from AArch64.td after the latter has defined its subtarget features.

This is not something that users should fix by changing how inclusions work, because:

  • LLVM tablegen expects there to be one central tablegen file per backend.
  • TableGen doesn't like redefining the same definition twice.

lenary avatar Aug 16 '22 13:08 lenary

Sorry for the delayed response (I didn't have notifications on for some reason).

For the problem of not finding definitions for things, TableGen effectively provides no location information for anything, which means right now we can only really provide definitions/references for classes (and even then, just in limited situations). I've been meaning to rewrite the internals to actually propagate locations (plus enable code completion/etc.), but I've been swamped with other things for the past few months. I'm hoping to get back to this soon though.

For the problem of non-self contained tablegen files, I generally view that as historical/legacy. MLIR also heavily uses TableGen, but we don't really suffer from any of these problems. A few years ago I added support for including the same file multiple times, which coupled with preprocessor include directives (i.e. ifndef/define/undef) let's you create a similar model of inclusion to C++ (you include things that you depend on). TableGen still requires a single file to process, but you can build .td files that are self contained (and more cleanly abstracted).

That being said, I do recognize that a lot of tablegen code has already been written. The main problem here is that the current modeling of the language server is based around self-contained files. To fix this we'd need to process the compile_commands file and determine the layout of the project, trying to process files that are related (i.e. building an index like clangd does). This is kind of necessary to do at some point anyways, at least if we want things like "find all references" to actually work. This isn't hard, conceptually, but requires care given that building an index can be extremely expensive (both in memory and time).

River707 avatar Aug 29 '22 06:08 River707