roslyn icon indicating copy to clipboard operation
roslyn copied to clipboard

Prevent analyzer/generator authors from using banned APIs

Open RikkiGibson opened this issue 1 year ago • 1 comments

We want to leverage BannedApiAnalyzer to try and prevent analyzer/generator authors from using APIs known to be problematic.

We'd like to include BannedSymbols.txt in Microsoft.CodeAnalysis.nupkg and have projects which reference it automatically pick it up and enforce the bans in a "downstream" fashion. It looks like BannedApiAnalyzer might be tooled toward having the BannedSymbols.txt only in the consuming project, so modifications to the analyzer or the build authoring might be needed.

Some APIs we'd like to ban:

  • System.IO.File. Basically all file I/O.
  • System.IO.Directory.
  • System.IO.Path.GetTempPath. Other stuff in Path is fine, like GetFileName, etc.
  • System.Environment. Analyzers should not read their settings directly from environment variables.
  • Assembly.Load and similar APIs.
  • CurrentUICulture. Should look at CommandLineArguments instead. This can come in implicitly through things like string.ToLower but we're probably not going to go searching exhaustively for things like that at this point.

RikkiGibson avatar Aug 09 '22 22:08 RikkiGibson

System.IO.Path.GetTempPath.

This one is special because it makes use of environment variables that can be different depending on where the compiler is run. It's just a tiny little wrapper on using System.Environment.

jaredpar avatar Aug 09 '22 23:08 jaredpar

@jmarolf raised the question of whether it would make sense to introduce a TFM specifically for analyzers--basically netstandard2.0 minus the APIs we don't want you to use. The TFM would completely lack the APIs we don't want analyzers to use. We would probably warn on upgrade when analyzers don't target it.

Issue here might be that it would be difficult to take more APIs away on an ad-hoc basis, and for analyzer authors if they need to defer deleting their usage of the API, it's hard for them to suppress granularly.

RikkiGibson avatar Aug 12 '22 20:08 RikkiGibson

"System.IO.File. Basically all file I/O."

This would completely break shader compilations in all the source generators in ComputeSharp, and that's one of the biggest features of the whole library. Couple questions:

  • Would there be a way for generator authors to bypass this and still use the banned APIs if needed?
  • Alternatively, could we consider adding a new API to allow I/O for generators behind some capability check, and in a controlled fashion? Eg. I remember @CyrusNajmabadi mentioned a build running in some container might not have I/O available at all. In that case, it might be fine for the generator to fail, so what about some API that generators could use to query the capability, and then to perform even just a subset of IO operations in a way the generator can at least track or control in some way?

UPDATE: followed up with Cyrus on Discord, we might have found a way for the generator to reference the native libs it needs without needing to do IO (outside of loading them), so this might've actually not be an issue after all 😄

Sergio0694 avatar Aug 12 '22 22:08 Sergio0694

It will be possible to suppress the banned API errors. Hopefully, though, seeing the error will lead the user to investigate whether there is another way to do what they're trying to do :)

RikkiGibson avatar Aug 13 '22 02:08 RikkiGibson

Gotcha. And yeah I'm exploring a possible solution for my scenario which might just be enough, and hopefully no longer against the rules 🙂

For context: my generator is bundling 4 native .dll-s as embedded resources (2 per CPU arch, x64 and Arm64), and when it runs it will detect the CPU arch, unpack the pair of .dll-s for the current arch into a temporary directory, then P/Invoke LoadLibrary to load it up, and then do a bunch of P/Invoke-s on the exported APIs from that .dll (it's the DirectX 12 shader compiler).

I'm thinking I might try to instead just package those libs along with the generator and get them to be copied to the same output folder where the generator assembly is (maybe with the help of some .targets or something). From there, I can just P/Invoke LoadLibrary from that path, and Cyrus mentioned that'd technically be considered fine because I'm just loading an assembly as if it was a normal assembly reference, which is allowed. But I would no longer just be extracting a file to an arbitrary temp folder on the disk.

Might end up asking Sam for some MSBuild help, but this seems very promising. Disregard my previous comment then I guess 😄

Sergio0694 avatar Aug 13 '22 02:08 Sergio0694

In general, for File IO we believe that should be done in an MSBuild task. MSBuild has the mechanisms to correctly track file input, outputs, reads, and writes. Source generators does not have this capability and it makes incremental generation extremely challenging to get right. source generators can accept MSBuild properties and additional files so the expectation is that File IO is best handled upfront and then passed to the source generator. Would like to understand any areas where this is not going to work.

jmarolf avatar Aug 15 '22 17:08 jmarolf

@jmarolf raised the question of whether it would make sense to introduce a TFM specifically for analyzers--basically netstandard2.0 minus the APIs we don't want you to use. The TFM would completely lack the APIs we don't want analyzers to use. We would probably warn on upgrade when analyzers don't target it.

Issue here might be that it would be difficult to take more APIs away on an ad-hoc basis, and for analyzer authors if they need to defer deleting their usage of the API, it's hard for them to suppress granularly.

My argument for this is that we really do not want analyzers or source generators doing file IO. It is never correct. Instead of warning about it the general design of .NET is to provide TFMs that restrict API surface area for you. If there are cases where file IO is an acceptable pattern, then we should instead focus on issuing diagnostics.

An additional benefit of a TFM is that we could provide better compiler version diagnostics at build time.

jmarolf avatar Aug 15 '22 18:08 jmarolf

Implemented in dotnet/roslyn-analyzers#6115

RikkiGibson avatar Sep 02 '22 21:09 RikkiGibson

What if an analyzer/generator properly detects a design-time build (via <CompilerVisibleProperty Include="DesignTimeBuild" />) and performs no IO in that case, but does a tiny bit of IO in full builds?

kzu avatar Mar 25 '23 20:03 kzu

What about Environment.NewLine to generate comments?

raffaeler avatar May 26 '23 10:05 raffaeler

What about Environment.NewLine to generate comments?

@RikkiGibson please answer this use case.

ADD-David-Antolin avatar Sep 18 '23 07:09 ADD-David-Antolin

@RikkiGibson

It will be possible to suppress the banned API errors. Hopefully, though, seeing the error will lead the user to investigate whether there is another way to do what they're trying to do :)

Can I take this as that the property is only for the author of source generators? Do I need to worry that someday Visual Studio might completely disable/fail on SGs that don't have the property set?

matkoch avatar Nov 15 '23 22:11 matkoch