roslyn icon indicating copy to clipboard operation
roslyn copied to clipboard

Add ChecksumAlgorithm to project snapshot

Open tmat opened this issue 2 years ago • 9 comments

The goal is to allow EnC analyzer to check whether content of a Document matches the checksum stored in the PDB when applying EnC/Hot Reload changes. Currently we read the content from disk since we can't get the correct checksum from Document.

Adds ChecksumAlgorithm to Project snapshot (on ProjectAttributes). The value is derived from the corresponding msbuild property set in project file.

Checksum algorithm of a Document is stored on DocumentState and initialized from the TextLoader when the loader is set to the state. This is needed because the state does not directly hold on the loader, but instead it holds on a value source for the SourceText or SyntaxTree. It's not part of DocumentAttributes as we would need to keep the attributes in sync with the value of the loader.

Fixes https://github.com/dotnet/roslyn/issues/51993 Pre-req for fix of https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1567043

Bans use of APIs in Roslyn whose implementations create SourceText but do not have a checksum algorithm parameter and always use the default (SHA1). In Roslyn product code we should be explicit about what checksum algorithm is used.

Allow usage of APIs listed in eng\config\BannedSymbols.txt in tests without warnings. These APIs would ideally not be used at all but usage in tests doesn't matter much. In fact, we still need to test these APIs even though we don't want them used in product code. Avoiding usage of APIs banned in this PR in all tests is not worth the significant effort it would take to clean up all tests.

tmat avatar Jul 21 '22 18:07 tmat

@jasonmalinowski @dotnet/roslyn-compiler PTAL

tmat avatar Jul 27 '22 02:07 tmat

Found some issues and redoing the representation of checksum alg in Document snapshot.

tmat avatar Jul 28 '22 00:07 tmat

@333fred can you do a second review on compiler changes here?

jaredpar avatar Aug 04 '22 18:08 jaredpar

@tmat: we may also want to update https://github.com/dotnet/roslyn/blob/4a1f4d37f0d68fa71e290d1a0efae5ace6ee5edd/src/Compilers/Core/MSBuildTask/Microsoft.CSharp.Core.targets#L188 and the equivalent bit for VB here. Even though we don't have to reparse a tree because of the checksum algorithm changing, it'd still mean less recreating of DocumentStates and potentially data that relies on tree identity (like some caches).

jasonmalinowski avatar Aug 05 '22 18:08 jasonmalinowski

@jasonmalinowski Re CommandLineArgsForDesignTimeEvaluation: Yes, I plan on following up with that change in another PR.

tmat avatar Aug 05 '22 19:08 tmat

/azp run

tmat avatar Sep 01 '22 17:09 tmat

Azure Pipelines successfully started running 4 pipeline(s).

azure-pipelines[bot] avatar Sep 01 '22 17:09 azure-pipelines[bot]

@tmat: one set of tests I didn't see anywhere (unless I missed it) was some tests around solution sync to verify that changing the checksum actually worked right.

jasonmalinowski avatar Sep 22 '22 22:09 jasonmalinowski

@tmat I'd also expect one or two tests on VisualStudioProject directly ensuring the end-to-end actually reading a file from disk works, since that'll actually ensure much of this works in the actual product.

jasonmalinowski avatar Sep 22 '22 22:09 jasonmalinowski

@sharwell

We have more than one BannedSymbols.txt file to handle cases where we only want them excluded in a subset of scenarios. The one you refer to here is the one that covers everything (including tests).

Yes, we have one that only applies to test projects and this one that applies to all projects. This PR changes the latter to only apply to production code.

tmat avatar Sep 26 '22 18:09 tmat

Still needs a fix for passing in compilation translations and tests to cover that. I'd also request a few tests for solution sync because if we screw that up it'll be hard to understand more fully.

Done.

I'm not sure why DocumentInfo needs to hold onto the checksum, since that should just come from the project. No different than DocumentInfo not needing parse options, etc -- the only time a document's text will get processed is in the context of a project, so we can just use it. Given our time contraint, I'm entirely content if we just make the constructor that takes it and the property that reads it internal and we can either [a] mop-up it later or [b] convince ourselves we actually need it and then document what it controls and how it interacts with the project level setting.

Removed.

Also not entirely sure if we need the text loader CanReload or not; similarly happy to make that internal if unsure.

I think it is necessary to make LoadTextAndVersion implementations reasonable. Most implementations will need to ignore the TextLoadOptions since they can't recreate the SourceText from binary repr. In that case they need to return false from CanReloadText, otherwise we end up with inconsistency between updated SourceTexts and the properties of the document.

tmat avatar Sep 26 '22 20:09 tmat

/azp run

tmat avatar Oct 10 '22 16:10 tmat

Azure Pipelines successfully started running 4 pipeline(s).

azure-pipelines[bot] avatar Oct 10 '22 16:10 azure-pipelines[bot]