cecil icon indicating copy to clipboard operation
cecil copied to clipboard

Use Microsoft.DiaSymReader.Native to write Windows PDBs

Open tmat opened this issue 7 years ago • 25 comments

For writing Windows PDB Mono.Cecil currently relies on globally registered SymWriter from diasymreader COM component that ships with .NET Framework. This component is not updated except with security patches.

Microsoft.DiaSymReader.Native package distributes the latest version of the SymWriter that includes fixes and new features (e.g. determinism). For convenience Microsoft.DiaSymReader package provides COM interface definitions and helpers for creating SymWriter and writing PDBs that can be also used by Mono.Cecil to simplify the implementation.

This change adds Mono.Cecil.WindowsPdb project that implements PDB writer using Microsoft.DiaSymReader packages. Since these packages require at least net45 or netstandard1.1 this new project is only built when targeting netstandard.

The change also enables building of all tests under netstandard configuration and adds Mono.Cecil.WindowsPdb.Tests project. The tests currently target net462 -- a bit more work would be needed to target netcoreapp and get them running under CoreCLR.

tmat avatar May 23 '17 04:05 tmat

@jbevain Please take a look and let me know what you think.

tmat avatar May 23 '17 04:05 tmat

This package is not open source. Is it possible to break this out somehow so it is not forced on everyone using the Cecil NuGet package?

MI3Guy avatar May 24 '17 17:05 MI3Guy

@MI3Guy

Microsoft.DiaSymReader.dll is open source (https://github.com/dotnet/symreader). Microsoft.DiaSymReader.Native.*.dll are not open source. The current implementation uses the one installed on your Windows machine, which is also not open source.

I'm open to suggestions on how to "break this out". Perhaps create a Mono.Cecil.Windows package?

tmat avatar May 24 '17 17:05 tmat

Well, it looks like the NuGet package is not, even if the source is open. (https://www.nuget.org/packages/Microsoft.DiaSymReader/)

I don't really know how to best break it out. I just don't really like making open source code have a hard dependency on a closed source library if it is avoidable.

MI3Guy avatar May 24 '17 17:05 MI3Guy

The nuget package is currently pre-release and thus on myget: https://dotnet.myget.org/feed/symreader/package/nuget/Microsoft.DiaSymReader. It will be published to nuget once it's release quality.

I hear you. I'm just saying Cecil has a hard dependency on closed source diasymreader.dll already. It invokes it thru COM.

tmat avatar May 24 '17 17:05 tmat

I guess is not that "hard" since you're not getting it if you're not on Windows.

tmat avatar May 24 '17 17:05 tmat

Well, even on myget it points to the EULA license. I know there are some efforts going on though to change the package licensing though.

And FWIW I have been using Cecil on Linux quite a bit.

MI3Guy avatar May 24 '17 18:05 MI3Guy

There are multiple things at play here.

  • This takes a dependency on another nuget package for .NET Desktop, which we've been avoiding so far.
  • We have an existing code base that currently mostly works.
  • @MI3Guy that wouldn't impact your ability to use Cecil on Linux or OSX. You've never been able to write native PDBs on Linux, that wouldn't change.

jbevain avatar May 24 '17 22:05 jbevain

Re license: I think @MI3Guy is concerned about the Cecil package having a dependency on packages with MS-EULA license. We are working on sorting that out (changing the license of the packages).

This takes a dependency on another nuget package for .NET Desktop, which we've been avoiding so far.

Is there particular reason to avoid it?

We have an existing code base that currently mostly works.

Wouldn't it be desirable for it to work 100%? We have identified quite a few issues when using Cecil on test cases that we use to validate PDB tools we build. Some of them are addressed by this change and we plan to follow up on the rest as well.

tmat avatar May 24 '17 22:05 tmat

Being self-contained without any external dependency (but the framework) is one of the pillar of Cecil.

I agree working 100% with native PDBs is desirable. Do we have a list of issues that need fixing? If it's simply about adding new features to the writer, we might as well do it. It's not like using this for the writer will solve being up to date with upstream PDB changes as we still need to maintain the reader part.

jbevain avatar May 24 '17 23:05 jbevain

On Windows Cecil is actually not self-contained. It has an external dependency on diasymreader.dll that's shipping with Windows thru COM instantiation. That component is buggy. We have been fixing these bugs and distributing the fixed binary in Microsoft.DiaSymReader.Native.nupkg. The library in Windows only gets critical security fixes. Even if we updated it (unlikely due to backward compat concerns) the behavior of Cecil would change based on what version of Windows is it running on. I don't think that's desirable. Besides we also add new features such as determinism, source link and embedded sources. Cecil won't be able to handle these if it keeps calling into diasymreader.dll.

Re the list of issues - yes, we have a list. Some of them need to be fixed in Cecil itself, others can't be as they symptoms of bugs in diasymreader.dll.

  • Document checksum is not emitted in PDBs (Cecil)
  • PDBs are missing some using-namespace information (Cecil)
  • Implement support for Custom Debug Information: Tuples (Cecil)
  • Implement support for Custom Debug Information: EnC (Cecil)
  • Implement support for Custom Debug Information: Dynamic (Cecil)
  • Values of some types of constants are not encoded correctly (Cecil, can reuse the encoding implemented in Microsoft.DiaSymReader.dll)
  • Make PDBs deterministic (diasymreader.dll)
  • Implement support for Source Link (diasymreader.dll)
  • Implement support for embedded sources (diasymreader.dll)
  • Custom debug information is dropped in some cases (diasymreader.dll)

tmat avatar May 24 '17 23:05 tmat

Re the reader: We can replace the reader as well. Microsoft.DiaSymReader provides the APIs needed to read all of the info.

tmat avatar May 24 '17 23:05 tmat

For all intent and purpose, Cecil on windows/.net framework is currently self-contained. You take Mono.Cecil.dll and Mono.Cecil.Pdb.dll and you're good to go (with the issues you mentioned of using diasymreader.dll that shipped with .NET, but it worked so far). Taking a dependency on Microsoft.DiaSymReader.Native means that if someone wants to use Cecil to read native PDBs they need to ship that DLL as well.

I agree with the pros of rebasing our native pdb implementation on Microsoft.DiaSymReader.Native.

Using this assembly for reading as well means losing the ability to read native pdb on non Windows platforms.

Currently I'm thinking the best way to move forward would be to provide this as an alternative native pdb reader/writer, ISymbolReaderProvider/ISymbolWriterProvider implementation.

jbevain avatar May 25 '17 15:05 jbevain

Taking a dependency on Microsoft.DiaSymReader.Native means that if someone wants to use Cecil to read native PDBs they need to ship that DLL as well.

That's correct. I wouldn't think shipping another couple of files (Microsoft.DiaSymReader, Microsoft.DiaSymReader.Native) would be a problem as long as they are neatly packaged together. That said, I'm not familiar with all the distribution requirements of Cecil so I absolutely leave that decision up to you.

Using this assembly for reading as well means losing the ability to read native pdb on non Windows platform

Fair point. I think it shouldn't be hard to add the missing features to the managed reader. So perhaps we can keep it.

Currently I'm thinking the best way to move forward would be to provide this as an alternative native pdb reader/writer, ISymbolReaderProvider/ISymbolWriterProvider implementation.

Sounds reasonable. I'll explore that approach and update the PR.

tmat avatar May 26 '17 00:05 tmat

@jbevain Looking into this again. If we have two providers how do we decide which one to load in SymbolProvider.GetReaderProvider? Do we try load the DiaSymReader one first and if it isn't available load the current one?

tmat avatar Jun 12 '17 20:06 tmat

@jbevain Alternatively we could keep it as is and anyone who wants to use the DiaSymReader-based reader/writer would need to pass the symbol provider explicitly.

tmat avatar Jun 12 '17 21:06 tmat

@tmat I think that's indeed the safest route right now. You could even provide your own provider to dispatch to DiaSymReader/Writer or the integrated PortablePdbReader/Writer as in:

https://github.com/jbevain/cecil/blob/master/symbols/pdb/Mono.Cecil.Pdb/PdbHelper.cs#L37

jbevain avatar Jun 12 '17 22:06 jbevain

@jbevain Added a separate provider as discussed above.

tmat avatar Jun 18 '17 20:06 tmat

@jbevain Do you have any feedback on this change? Is it ok to merge?

tmat avatar Jun 27 '17 19:06 tmat

@tmat, I'm currently traveling, will review when am back :)

jbevain avatar Jun 28 '17 07:06 jbevain

That PR mixes what should have been at least 4 different independent changes :(

jbevain avatar Jul 19 '17 01:07 jbevain

@jbevain There are 3 independent commits. I can send 3 PRs, each having the respective commit, if you prefer.

tmat avatar Jul 19 '17 01:07 tmat

@jbevain Should I split this change to 3 PRs?

tmat avatar Aug 01 '17 00:08 tmat

@jbevain we'd appreciate if you can give @tmat your feedback about split/merging this PR at your earliest convenience. We would like to have this fix checked in as soon as possible. Thanks.

swaroop-sridhar avatar Aug 03 '17 02:08 swaroop-sridhar

Folks, I was in vacation, and I maintain Cecil on my spare time.

Please do split the PR to make it easier to review and iterate over them.

jbevain avatar Aug 08 '17 16:08 jbevain