roslyn icon indicating copy to clipboard operation
roslyn copied to clipboard

Unable to add native .dll reference to source generator project

Open myblindy opened this issue 2 years ago • 25 comments

Version Used: VS 17.0.0 Preview 2.0

Steps to Reproduce:

You can find the full source code at https://github.com/myblindy/GrimBuilding/tree/efcore (the efcore branch).

I understand that source generators can't automatically harvest dependencies from nuget packages and you have to use a clunky work-around to get it to work, and I have done so. This is my source generator project:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <LangVersion>preview</LangVersion>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.CodeAnalysis.CSharp" Version="4.0.0-1.final" PrivateAssets="all" />
    <PackageReference Include="Microsoft.CodeAnalysis.Analyzers" Version="3.3.2" PrivateAssets="all" />
    <PackageReference Include="SQLitePCLRaw.bundle_e_sqlite3" Version="2.*-*" GeneratePathProperty="true" PrivateAssets="all" />
    <PackageReference Include="Microsoft.Data.Sqlite.Core" Version="6.*-*" GeneratePathProperty="true" PrivateAssets="all" />
    <PackageReference Include="SQLitePCLRaw.lib.e_sqlite3" Version="2.*-*" GeneratePathProperty="true" PrivateAssets="all" />
    <PackageReference Include="SQLitePCLRaw.provider.e_sqlite3" Version="2.*-*" GeneratePathProperty="true" PrivateAssets="all" />
  </ItemGroup>

  <PropertyGroup>
    <GetTargetPathDependsOn>$(GetTargetPathDependsOn);GetDependencyTargetPaths</GetTargetPathDependsOn>
  </PropertyGroup>

  <Target Name="GetDependencyTargetPaths">
    <ItemGroup>
      <TargetPathWithTargetPlatformMoniker Include="$(PKGSQLitePCLRaw_bundle_e_sqlite3)\lib\netstandard2.0\SQLitePCLRaw.batteries_v2.dll" IncludeRuntimeDependency="false" />
      <TargetPathWithTargetPlatformMoniker Include="$(PKGSQLitePCLRaw_provider_e_sqlite3)\lib\netstandard2.0\SQLitePCLRaw.provider.e_sqlite3.dll" IncludeRuntimeDependency="false" />
      <TargetPathWithTargetPlatformMoniker Include="$(PKGSQLitePCLRaw_lib_e_sqlite3)\runtimes\win-x64\native\e_sqlite3.dll" IncludeRuntimeDependency="false" />
      <TargetPathWithTargetPlatformMoniker Include="$(PKGMicrosoft_Data_Sqlite_Core)\lib\netstandard2.0\Microsoft.Data.Sqlite.dll" IncludeRuntimeDependency="false" />
    </ItemGroup>
  </Target>
</Project>

Since there isn't even transitive support, I added every nested Microsoft.Data.Sqlite package one by one, generated its property and referenced it using TargetPathWithTargetPlatformMoniker. It all works until I get to the native e_sqlite3.dll, if I use TargetPathWithTargetPlatformMoniker with it, it tries to reference it as a managed library, and it fails as expected:

4>CSC : warning CS8034: Unable to load Analyzer assembly C:\Users\meep\.nuget\packages\sqlitepclraw.lib.e_sqlite3\2.0.5-pre20210521085756\runtimes\win-x64\native\e_sqlite3.dll : PE image doesn't contain managed metadata.

So given that the path is found, is there a different tag I can use to make the main project copy the e_sqlite3.dll file so the analyzer can use it?

myblindy avatar Jul 17 '21 19:07 myblindy

Also hitting this one (had opened #63396 though it was a dupe of this). I'm trying to update my generator in ComputeSharp to remove IO (following the conversation in #63290), and I'm currently with the following structure in my NuGet:

My generator then will P/Invoke LoadLibraryW and pick the pair of native .dll-s for the correct architecture. Right now this is just causing the analyzer to fail to load completely, because Roslyn sees those native libs in the directory free of the analyzer.

I really want to stop doing IO from the generator, and I do need to bundle those .dll-s this way to do that 🥲

@chsienki if it helps, when you start working on this feel free to ping me on Discord or Teams if you wanted another test scenario to validate the changes. I'd love to help and to try out a preview build of the toolset to double check a fix for this does resolve all issues both when going through NuGet and locally when using a project reference (I'm using both) 🙂

Sergio0694 avatar Aug 26 '22 21:08 Sergio0694

Sharing some more findings while testing https://github.com/Sergio0694/ComputeSharp/pull/349. I believe there's two different issues related to supporting this scenario (ie. analyzers that depend on native libs bundled with the analyzer .dll), depending on how one is referencing the analyzer in question. Specifically:

  1. When the analyzer is packed into a NuGet package, then Roslyn just fails to load entirely because it sees those native .dll-s, tries to load them as analyzers, and then crashes when it realizes they're not managed .dll-s. This is the issue mentioned here, where Roslyn will just crash with that "warning CS8034: Unable to load Analyzer assembly <FOO.dll> : PE image doesn't contain managed metadata." error message.
  2. When the analyzer is referenced locally as a project reference, it seems Roslyn isn't copying content elements correctly. In my analyzer .csproj, I have a few <None> items (also tried with <Content>, same issue) marked as CopyToOutputDirectory (the native .dll-s). When building the solution locally Roslyn will copy the analyzer .dll into some temporary folder (eg. C:\Users\Sergio\AppData\Local\Temp\VBCSCompiler\AnalyzerAssemblyLoader\0ea7a7ff50e341df8c5ee64fc96ecadc\11), but it won't copy those native .dll-s as well, even though they're marked as content to be copied to output. As in, if I open that folder myself I can see the analyzer .dll just being there all alone (🥲). As a result, the analyzer fails to find them at runtime and will crash when trying to call LoadLibrary on them, or loading the .dll in some other way. I was kinda expecting Roslyn to also respect files that are marked as copy to output, when copying the analyzer .dll-s into its temporary folder for the ALC to use to load the analyzer.

To be clear, issue (1) is the most important one to solve, as that'd impact consumers of the library. Issue (2) can be worked around locally as it's mostly just encountered when working locally and setting up test projects and whatnot, though it'd be nice if that scenario also "just worked" out of the box, as one would normally expect items marked as CopyToOutputDirectory to always be available next to the executing assembly.

Hope this helps! 😄

cc. @RikkiGibson

Sergio0694 avatar Sep 19 '22 21:09 Sergio0694

it won't copy those native .dll-s as well, even though they're marked as content to be copied to output

Let me blow your mind: embed the native dlls in your assembly yourself, and write them out to disk during DLL startup. If you only use run-time binding (LoadLibrary, DllImport in a separate class, etc.) the native library should already be present by the time it happens, and your scenario will work.

This is of course bordering on the ridiculous, but you're likely to get farther than waiting for the root cause of this to be fixed :(

myblindy avatar Sep 20 '22 15:09 myblindy

"Let me blow your mind: embed the native dlls in your assembly yourself, and write them out to disk during DLL startup."

My mind was not blown, as that's exactly what I've been doing for months already. But the whole point is that scenario is completely unsupported: source generators shouldn't do any IO. Not to mention it makes the setup unnecessarily complicated and ends up creating a lot of garbage in temporary folders for all the copied .dll-s.

Sergio0694 avatar Sep 20 '22 15:09 Sergio0694

Jokes aside, my biggest problem with this issue (my issue, and the first on your list) is that if the build system did literally nothing, it would all just work.

The problem only occurs because it's trying to load every dll as a managed assembly and it crashes on native ones, but if it just ignored load problems with native assemblies, everything else would work as is: the native dlls would still be copied alongside the managed ones and be ready for function binding.

myblindy avatar Sep 21 '22 15:09 myblindy

The problem only occurs because it's trying to load every dll as a managed assembly

What does "it's" refer to here? Roslyn, or something else?

CyrusNajmabadi avatar Sep 21 '22 16:09 CyrusNajmabadi

Roslyn, yes. If it just ignored native PEs, then all would be fine 🙂

Sergio0694 avatar Sep 21 '22 16:09 Sergio0694

Do you know where roslyn is loading the native dlls? can you hook the point where that happens somehow and show the stack that is doing htat?

CyrusNajmabadi avatar Sep 21 '22 17:09 CyrusNajmabadi

@CyrusNajmabadi Msbuild targets pass all dlls in the analyzer package to the compiler as /analyzer command line params. AnalyzerAssemblyLoader then loads all these dlls. The loader could definitely check whether the dll is managed and ignore native.

tmat avatar Sep 21 '22 17:09 tmat

Yeah, I'd assume the issue is originating here:

https://github.com/dotnet/roslyn/blob/dd044537d55b074e49c8ed8eab74a8cc6895389c/src/Compilers/Core/Portable/DiagnosticAnalyzer/AnalyzerFileReference.cs#L451-L469

Where the loader just tried to get the Assembly from the input .dll, which will of course crash for a native PE. If Roslyn just ignored these .dll-s (effectively doing the same it does for a managed .dll that just contains no analyzer types, ie. doing nothing), then this would completely solve the issue here 🙂

Sergio0694 avatar Sep 21 '22 17:09 Sergio0694

~~i'm confused. why is the native dll being added as an analyzer dll? it seems appropriate to me that if it's an analyzer dll and it's not, it should fail/crash :)~~

Missing tomas' explanation.

CyrusNajmabadi avatar Sep 21 '22 17:09 CyrusNajmabadi

@CyrusNajmabadi Msbuild targets pass all dlls in the analyzer package to the compiler as /analyzer command line params.

ickkkkkkkkkk. it feels like this is on msbuild. Having us try to infer that it is passing bad stuff to us and undo it seems like the wrong way to handle things. CAn we not come up with a way for them to do things differently here?

CyrusNajmabadi avatar Sep 21 '22 17:09 CyrusNajmabadi

Do you know who we should ping from the MSBuild team about this? With its behavior currently being this, and leading Roslyn to just fail to load analyzers entirely, we're a bit in a bad spot now as the only way to make it work at all currently is to manually embed and unpack libraries at runtime, which goes directly against #63290 and is explicitly unsupported 😅

Sergio0694 avatar Sep 21 '22 17:09 Sergio0694

We are supposed to receive all the analyzer dependencies in the command line. This is important for determinism (@jaredpar in case he has an FAQ handy for this :P). We could consider using another flag instead of /analyzer to give them to the compiler but it's not hugely important IMO. we just have to account, one way or another, for the possibility that an analyzer dependency is native.

RikkiGibson avatar Sep 21 '22 17:09 RikkiGibson

i'm confused. why is the native dll being added as an analyzer dll? it seems appropriate to me that if it's an analyzer dll and it's not, it should fail/crash :)

I realize you crossed that out, but I want to spell it out explicitly. The root problem is that there's no way to add dependencies to an analyzer in an explicit and clear way (for example, by marking the files in some way, or putting them in different folders, or referencing other nuget packages as dependencies only). If we had that, Roslyn could use that information and ignore dependencies when looking for active analyzers.

Instead what we have is a hack where it considers everything in the analyzers/dotnet/<lang>/Libraries to be a potential analyzer, and tries to load it as such. I want to emphasize how much of a hack that is in the .Net Core/netstandard world, where you get 50 billion little dlls next to every assembly you create, whether implicitly (linking to the GAC for netstandard 2.0) or explicitly in the binary folder. Those are pure dependencies, not an analyzer output to be loaded by VS and Roslyn.

However the major damage is done, we're stuck in this world and everything in that folder is a potential analyzer. It's far too late to break this implementation. However a tiny patch to not consider dlls it can't parse (due to them being native) will completely fix this issue. Considering the bigger problem of the current implementation, what's one more tiny patch? It won't even break anything, it's literally impossible to have a published nuget library (and thus in use) that has native dependencies in them right now. It's a purely additive change.

myblindy avatar Sep 21 '22 17:09 myblindy

what's one more tiny patch?

This is never a good argument for a change. Changes are based on their merits, trade offs, etc ..

if the build system did literally nothing, it would all just work.

If the build system did literally nothing then it would do, literally, nothing. It would not compile, it would not produce any meaningful output. The build system by design must take action on the inputs its given and interpret them as designed. In the case of /analyzer the design is to interpret the input as managed binaries.

jaredpar avatar Sep 21 '22 18:09 jaredpar

In the case of /analyzer the design is to interpret the input as managed binaries.

That's not quite exactly what I'm saying here. I'm talking about the files located physically located in the analyzers/dotnet/<lang>/Libraries folder and how they get from there to the /analyzer flag. In an ideal world, only actual analyzers would get passed, however right now everything gets passed, which leads to the issue we're having. So given that, and assuming this issue is accepted as a defect (you don't sound very convinced), there's two easy ways to fix it:

  1. Fix the code parsing the analyzer files and passing them to Roslyn to only pass either assemblies containing analyzers (more efficient, as Roslyn doesn't have to spend time to parse unrelated files) or at least only managed assemblies (less efficient, but still fixes the issue).
  2. Make Roslyn silently ignore any native libraries passed as analyzers. Probably the easiest solution, but also the hackiest.

Changes are based on their merits, trade offs, etc

Perfectly agreed, which is why I explicitly mentioned there's no trade offs here. This case is entirely not supported at all right now.

As for merits, that's being able to write analyzers (and specifically source generators) that use native libraries during processing. This next part might be more contentious and possibly easily dismissed as "not important to support", but if given the choice I would use Sqlite databases as output, and @Sergio0694 would use his HLSL compiler. Now that I'm writing this, I also have an idea related to the GLSL compiler.

myblindy avatar Sep 21 '22 18:09 myblindy

Make Roslyn silently ignore any native libraries passed as analyzers. Probably the easiest solution, but also the hackiest.

This is also a regression in functionality. This error today catches developers who are constructing incorrect NuPkg files. Or packaging up DLLs that are meant to be managed but are invalid due to a tool in their build. Consider as an example IL rewriters that can end up producing invalid assemblies. The proposal here is essentially removing an existing safety guard from the ecosystem by silently ignoring this type of failure.

It's reasonable to debate how prevalent this type of problem is but it's definitely not consequence free.

So given that, and assuming this issue is accepted as a defect (you don't sound very convinced), there's two easy ways to fix it:

Other generator / analyzer authors have approached this problem by either passing the dependency via /additionalFiles or embedding them as resources. Given they've been successful with that my instinct is to push back and ask for reasons why this won't work in your case.

jaredpar avatar Sep 21 '22 19:09 jaredpar

"or embedding them as resources"

I'm one of those, as that's what I'm currently doing in ComputeSharp. As I mentioned though that has major issues, because:

  • The logic to extract the dependency at runtime is complicated compared to just loading a library.
  • Need to extract into a temporary folder, so you end up leaving a lot of garbage files on the machine that's using the analyzer. Over time this also accumulates a bit. Not a bit deal since it's temp files, but annoying nonetheless.
  • This means the analyzer is doing IO, which as mentioned in #63290, is a completely unsupported scenario.

Not familiar with that approach using additional files, are there some resources on that?

Sergio0694 avatar Sep 21 '22 19:09 Sergio0694

I think the part of the problem is the build system is passing the native dependency as /analyzer when the user doesn't want that to happen, and there isn't a clear way to prevent that. Also, issue #47292 implies that native dependencies can't be passed through /additionalFiles.

RikkiGibson avatar Sep 21 '22 19:09 RikkiGibson

Doing random IO would be a problem, as it would represent machine state not encoded into your inputs. Using IO to effectively just load data you are shipping is likely less of a problem.

CyrusNajmabadi avatar Sep 21 '22 19:09 CyrusNajmabadi

This is also a regression in functionality. This error today catches developers who are constructing incorrect NuPkg files. Or packaging up DLLs that are meant to be managed but are invalid due to a tool in their build. Consider as an example IL rewriters that can end up producing invalid assemblies. The proposal here is essentially removing an existing safety guard from the ecosystem by silently ignoring this type of failure.

Is reporting errors for assemblies that have incorrect CorFlags set more important for Roslyn analyzer ecosystem than simplifying loading native analyzer dependencies? Why do we care about bad IL rewriters producing bad analyzer assemblies? Seems like super edge case.

tmat avatar Sep 21 '22 20:09 tmat

Seems like super edge case.

I agree it's an edge case. The claims being made are this is "it won't break anything". I'm pushing back on that notion. Every change has a consequence, ignoring them doesn't benefit the discussion.

jaredpar avatar Sep 21 '22 20:09 jaredpar

It seems like we should be able to distinguish this doesn't have managed metadata from this has malformed managed metadata, and continue to give an error on the latter if we think that has value.

RikkiGibson avatar Sep 21 '22 21:09 RikkiGibson

"Using IO to effectively just load data you are shipping is likely less of a problem"

@CyrusNajmabadi Sure, and in fact that's what we're currently doing, but it's still not ideal because (1) it adds extra complexity, (2) once the new analyzers for banned APIs go online, we'll also get warnings on all those IO APIs we'll have to disable, (2) it adds overhead, and (3) it leaves a lot of garbage in temp folders for consumers 🥲

Sergio0694 avatar Sep 22 '22 00:09 Sergio0694

Picking this up now that 17.4 is mostly done, have we reached a conclusion on what the correct approach (if any) should be used to solve this issue? To recap, @RikkiGibson's proposal makes sense to me:

"It seems like we should be able to distinguish this doesn't have managed metadata from this has malformed managed metadata, and continue to give an error on the latter if we think that has value."

That would completely solve the issue while also preserving existing behavior in other cases, in case that was desired 🙂

Sergio0694 avatar Oct 26 '22 10:10 Sergio0694