msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

Analyzers prototyping - Loading and acquisition

Open JanKrivanek opened this issue 5 months ago • 11 comments

Context

#9627

Built-in analyzers will be part of MSBuild (but separate assembly probably?), custom will be possible to plug-in post-installation. Analyzer management module will need to be able to discover and load analuzers - to provide them to the infractructure (that can then register and configure them based on the default and user configuration)

Things to be investigated and decided

  • Just investigation and proposal (no implementation yet - V1 will be just manual) - The story of acquiring custom analyzers - using msbuild restore phase would be convenient for this but might prevent execution of some analysis (as analyzers might not be available during restore). Another option is similar story to template engine - dedicated commands for searching/listing/installing analyzers while searching and installing use nuget client and server API.
  • Roslyn current implementation and what can we reuse

JanKrivanek avatar Jan 11 '24 12:01 JanKrivanek

Brain dump

I think we agree that the ideal situation for .NET projects is to get MSBuild analzyers via NuGet package reference, just like acquiring additional build logic or Roslyn source analyzers/generators.

For C++ NuGet isn't pervasive so we might need to coordinate with them for their ideal desires.

NuGet can have assets in the project.assets.json file that get rendered into items at build time, and can also set up build logic that will be imported by the project evaluation that happens at build time.

Any registration of analyzers within the project will run into similar problems: some of the analysis rules we'd like to enable would be best run during evaluation--things like rules on condition expressions, rules about defining or not defining certain properties in certain locations, and so on. If we haven't discovered and enabled all of the relevant analyzers already before evaluation starts, we either can't have rules like that or have to unconditionally buffer all the relevant analysis state or events until we do, then replay them through the analyzers.

Roslyn analyzers don't have this problem because they can't be discovered within the compiler--they're discovered by the MSBuild layer and then a flat list is passed to the compiler at its initialization/command line start time.

That's an option for MSBuild analyzers too but has two big problems:

  • Cumbersome to enable (you have to change a script or something that invokes MSBuild or the Directory.Build.rsp and with some kind of absolute path)
  • Incompatible with package-based distribution of analyzers (they wouldn't exist on a fresh clone, so even if you were willing to construct the path dotnet build -buildanalyzer:/home/user/.nuget/whatever/net8.0/WhateverAnalyzer.dll wouldn't work during Restore.

Analyzers could be entirely offline--working on a binlog replay, so you can assemble the analyzers during build somehow and then analyze afterward. This could even be transparent to the user (if build discovers analyzers automatically replay the binlog after the build completes but before we exit).

There could also be special handling for analyzers in NuGet restore--some kind of special file that we look at before starting to evaluate a project that has analyzer registrations in it. However, this would ideally be in obj, just like project.assets.json, and the location of obj is configurable via property--so we might have to (partially?) evaluate the project to find the analyzers then evaluate the project!

There's a possibility of making some kind of analyzer configuration note in memory during restore and using it during build, but that wouldn't work when the restore and build steps are in separate processes/invocations.

rainersigwald avatar Jan 18 '24 22:01 rainersigwald

C++ is a great point! - I haven't thought about that much. Lets return to it once we have some stronger opinion on the net.sdk

What is your thoughts on offering standalone CLI for handling analyzers? Something along the lines of

> git clone <your repo> && git checkout <branch> && git pull
...
> dotnet build install-analyzer <package1> <package2> <package3>
...
> dotnet build <proj>

While not leveraging current execution paths (so partially 'reinventing the wheel'), it doesn't suffer the disadvantages of current restore execution path.

We can reuse (or even generalize and import) the logic and experience that's part of templating engine (dotnet new install, dotnet new search, dotnet new list) - allowing users to discover and use analyzers with low friction. We can still allow to reference them as standard nugets - but for the ones that need evaluation time context we can e.g. issue warning pointing the partiall blindness of the analyzer installed that way and option to overcome this with dedicate CLI.

All that wouldn't of course be a V1 thing.. but if seen as viable way, we'd know that for V1 we can rely on fixed location containing analyzers.

JanKrivanek avatar Jan 19 '24 10:01 JanKrivanek

As a flow I don't mind that too much but how would it work concretely? What would dotnet build install-analyzer do and how would subsequent dotnet build invocations pick it up?

rainersigwald avatar Jan 19 '24 14:01 rainersigwald

As a flow I don't mind that too much but how would it work concretely? What would dotnet build install-analyzer do and how would subsequent dotnet build invocations pick it up?

dotnet build install-analyzer will allow to restore custom analyzers (nuget package?) - basically it downloads and unpacks them.

YuliiaKovalova avatar Jan 19 '24 14:01 YuliiaKovalova

Maybe first 1 important question I didn't put nowhere in this item or discussion: Where can we afford to have the declaration of used analyzer?

  • Does it have to be part of PackageDependencies in the project file (same as is the case for the Roslyn analyzers)?
  • Can we afford to have something of our own? - e.g. 'analyzers.json' as a sibling to the project file
  • Is there anything else existing - other than msbuild files - we can plug ourselves into? E.g. if there would be a way how to express this in .editorconfig files it would be awesome

I guess you know where I'm heading :-) - the acquisition as part of restore will bring complications - can we avoid this alltogether? That's how I was imagining the separate CLI would work

If we absolutely need to follow the suit of PackageReferences - then we're brainstorming with @YuliiaKovalova a two-staged acquisition:

  • first stage - all analyzers that are already restored in the known locations - those would be 'woken-up' without any restrictions
  • second stage - during restore (which is not standalone invocation, but part of build invocation) - the analyzers woken-up during this stage would be just fine if they declare they use post-eval context only, but they would lead to some predefined (and silentable) warning from the infra informing that analyzer XYZ requires (pre-)eval data and hence might not be possibly able to operate fully when not restored in a standalone dotnet restore invocation.

It would be nice to brainstorm (not necessarily stick to) some solution(s), that do not require sacrifice in a form of over-buffering or over-emitting data, because we do not know yet if they will be needed at some point.

JanKrivanek avatar Jan 19 '24 14:01 JanKrivanek

Can we afford to have something of our own?

I would rate this as a big con of an approach that requires it, but not necessarily rule it out entirely.

rainersigwald avatar Jan 19 '24 15:01 rainersigwald

basically it downloads and unpacks them.

That part makes sense, but how does a subsequent build invocation know where they are and that they should be hooked up?

rainersigwald avatar Jan 19 '24 15:01 rainersigwald

basically it downloads and unpacks them.

That part makes sense, but how does a subsequent build invocation know where they are and that they should be hooked up?

Options provided by @JanKrivanek:

  1. Hardcoded well known folder (e.g. '.analyzers' next to projectfile or in any folder up the folder structure - similar to how .git, .editorconfig etc. work), everything in that folder is loaded;
  2. Custom configuration file (e.g. analyzers.json) - again, similar location logic as .editorconfig;
  3. Mentioned in msbuild project files (same as roslyn)

YuliiaKovalova avatar Jan 22 '24 11:01 YuliiaKovalova

Could it be a command-line option read from Directory.Build.rsp? I expect this would be read early enough. MSBuild apparently supports %MSBuildThisFileDirectory% in there (implementation PR https://github.com/dotnet/msbuild/pull/4499, documentation request https://github.com/MicrosoftDocs/visualstudio-docs/issues/9971), so the option could even reference an analyzer at a directory relative to the location of Directory.Build.rsp.

KalleOlaviNiemitalo avatar Jan 22 '24 13:01 KalleOlaviNiemitalo

Existing constraints:

  1. VS doesn't rely on the command line, and we may need to provide an alternative way of communication if analysis is required for the build.
  2. Modifying the Restore target can be problematic since it is a feature request for the NuGet team.
  3. The existing mechanism adds information about packages to the .props file in the obj folder too late; some evaluation data may not be available anymore.
  4. Creating a separate CLI is conforming to the current standard setup by Roslyn analyzers.
  5. Hooking using .rsp file is possible, but analyzer packages may not be restored yet. .rsp capabilities are restricted now and don't allow to do package restore.
  6. Adding sources to the repo (the plan to have a folder with all rules) isn't a good practice and can cause pushback from customers (perhaps we can put the sources in the .vs folder).
  7. Currently, handling Roslyn rules is managed by the SDK itself, which goes through package.json files and searches for dedicated packages using pattern matching - we don't plan to rely on this experience.

Subproblems to the mentioned above: 1 - "checked into the repo" == how do we express what analyzers are wanted (probably the package reference, but would we need something else as well?). 2 - "at or after restore" == this is the part where we'll need to somewhere (obj? .vs? elsewhere?) store information that analyzers are wanted and where to find them. 3 - "for each analyzed build invocation" == this is the part where we need to be able to 'wake up analyzer' and if we detect that it's not woken up early enough, then give some meaningful error with good workaround.

YuliiaKovalova avatar Jan 24 '24 13:01 YuliiaKovalova

If MSBuild is building multiple projects in the same invocation (via the MSBuild task, or a graph build ProjectReference), then is it desired that each project be able to have different analyzers?

KalleOlaviNiemitalo avatar Jan 24 '24 14:01 KalleOlaviNiemitalo