Optimize CopyRefAssembly task by checking file size and timestamp before MVID extraction

Open Copilot opened this issue 2 months ago • 6 comments

Summary

This PR optimizes the CopyRefAssembly MSBuild task to significantly improve performance on incremental builds by adding a fast-path check before expensive MVID extraction.

Problem

As reported in the issue, the CopyRefAssembly task was taking considerable time even on incremental builds. The task runs for every TFM of every C# project, and was always extracting and comparing MVIDs from both source and destination assemblies, even when the files were identical.
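
For context, extracting an MVID means opening the assembly and parsing its PE headers and metadata, which is why doing it for both files on every build is costly. A minimal sketch of that slow path, assuming the System.Reflection.Metadata APIs; `MvidSketch`, `ReadMvid`, and `HaveSameMvid` are hypothetical names for illustration, not the task's actual code:

```csharp
using System;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

public static class MvidSketch
{
    // Hypothetical helper: reads the module version ID (MVID) of a managed
    // assembly by opening the file and parsing its PE headers and metadata.
    public static Guid ReadMvid(string assemblyPath)
    {
        using var stream = File.OpenRead(assemblyPath);
        using var peReader = new PEReader(stream);
        MetadataReader metadata = peReader.GetMetadataReader();
        return metadata.GetGuid(metadata.GetModuleDefinition().Mvid);
    }

    // Hypothetical helper: two files count as the same build output
    // when their MVIDs are equal. Both files are opened and parsed.
    public static bool HaveSameMvid(string sourcePath, string destinationPath) =>
        ReadMvid(sourcePath) == ReadMvid(destinationPath);
}
```

Each call opens a file stream and allocates a metadata reader, which is the work the fast path is meant to avoid.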

Solution

Following @jjonescz's suggestion, this PR implements a size and timestamp check before MVID extraction, similar to how MSBuild's standard Copy task works:

  1. Fast path: If both file size and last write timestamp match, the copy is skipped immediately
  2. Fallback: If size or timestamp differ, the existing MVID-based comparison continues as before

// Fast path: check size and timestamp first to avoid expensive MVID extraction
var sourceInfo = new FileInfo(SourcePath);
var destInfo = new FileInfo(DestinationPath);

if (sourceInfo.Length == destInfo.Length &&
    sourceInfo.LastWriteTimeUtc == destInfo.LastWriteTimeUtc)
{
    Log.LogMessageFromResources(MessageImportance.Low, "CopyRefAssembly_SkippingCopy1", DestinationPath);
    return true;
}

Benefits

  • Performance: Avoids opening files and reading assembly metadata when files haven't changed
  • Scalability: The optimization compounds across all TFMs and projects in a build
  • Correctness: Maintains the same behavior when files actually differ - MVID checking still occurs when needed
  • Safety: Wrapped in try-catch to gracefully fall back to MVID checking if any errors occur
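
The fallback behavior described in the bullets above could look roughly like the following. This is a sketch under assumptions, not the PR's actual code: `FastPathSketch`, `CanSkipCopy`, and the `mvidsMatch` delegate are hypothetical, and the set of exceptions caught is a guess.

```csharp
using System;
using System.IO;

public static class FastPathSketch
{
    // Returns true when the copy can be skipped.
    // `mvidsMatch` is a hypothetical stand-in for the existing MVID comparison.
    public static bool CanSkipCopy(
        string sourcePath,
        string destinationPath,
        Func<string, string, bool> mvidsMatch)
    {
        try
        {
            var sourceInfo = new FileInfo(sourcePath);
            var destInfo = new FileInfo(destinationPath);

            // Fast path: same size and same last write time means "unchanged".
            if (sourceInfo.Exists && destInfo.Exists &&
                sourceInfo.Length == destInfo.Length &&
                sourceInfo.LastWriteTimeUtc == destInfo.LastWriteTimeUtc)
            {
                return true;
            }
        }
        catch (Exception e) when (e is IOException || e is UnauthorizedAccessException)
        {
            // Assumption: file-system errors are ignored here so that the
            // slower MVID comparison below still gets a chance to decide.
        }

        // Slow path: defer to the MVID-based comparison.
        return mvidsMatch(sourcePath, destinationPath);
    }
}
```

Passing the MVID comparison in as a delegate keeps the sketch self-contained; in the task itself the fallback would simply be the code that already exists.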

Performance Validation

Created a dedicated BenchmarkDotNet project (src/Tools/MSBuildTaskBenchmarks) to measure the performance improvement. Benchmark results demonstrate:

| Method | Mean | Allocated |
| --- | --- | --- |
| Size and Timestamp Check (Fast Path) | 4.141 μs | 472 B |
| MVID Extraction (Slow Path) | 35.049 μs | 8,960 B |
| Combined Check (Fast Path First) | 4.091 μs | 472 B |

Key Findings:

  • 8.6x faster when files haven't changed (35.0 μs → 4.1 μs)
  • 95% less memory allocated (8,960 B → 472 B)
  • No added overhead in the common case: the combined check (4.091 μs) costs about the same as the size and timestamp check alone (4.141 μs)
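
As a quick check of the headline figures, assuming they compare the slow-path row against the combined-check row of the benchmark table:

```latex
\[
\text{speedup} = \frac{35.049\ \mu\text{s}}{4.091\ \mu\text{s}} \approx 8.57,
\qquad
\text{allocation reduction} = 1 - \frac{472\ \text{B}}{8960\ \text{B}} \approx 0.947
\]
```

These round to the quoted 8.6x and 95%. Using the fast-path-only row (4.141 μs) instead gives a speedup of about 8.46x.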

Testing

Added comprehensive test coverage:

  • SourceAndDestinationWithSameSizeAndTimestamp - Validates the new fast-path optimization
  • SourceAndDestinationWithSameMvidButDifferentTimestamp - Ensures MVID checking still works correctly
  • Updated existing tests to use direct timestamp manipulation instead of Thread.Sleep for deterministic and faster test execution
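
Setting timestamps directly, rather than sleeping until the clock advances, might look like the sketch below. `TimestampTestSketch` and `CreatePair` are hypothetical names and the fixed date is arbitrary; the PR's actual test helpers may differ.

```csharp
using System;
using System.IO;

public static class TimestampTestSketch
{
    // Hypothetical test helper: writes two files with identical content and
    // then pins their last write times explicitly, so the test does not
    // depend on wall-clock time passing between the two writes.
    public static (string Source, string Destination) CreatePair(
        string directory, bool sameTimestamp)
    {
        string source = Path.Combine(directory, "source.dll");
        string destination = Path.Combine(directory, "destination.dll");

        byte[] content = { 1, 2, 3, 4 };
        File.WriteAllBytes(source, content);
        File.WriteAllBytes(destination, content);

        // Arbitrary fixed timestamp chosen for determinism.
        var stamp = new DateTime(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        File.SetLastWriteTimeUtc(source, stamp);
        File.SetLastWriteTimeUtc(
            destination,
            sameTimestamp ? stamp : stamp.AddSeconds(-10));

        return (source, destination);
    }
}
```

A test for the fast path would call `CreatePair(dir, sameTimestamp: true)`, while a test for the MVID fallback would pass `false` so that only the timestamp differs.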

All 7 CopyRefAssembly tests pass (378 tests total in the MSBuildTaskTests suite).

Fixes dotnet/roslyn#80611

Original prompt

This section details the original issue you should resolve

<issue_title>CopyRefAssembly Task takes quite a long time even on incremental builds</issue_title>
<issue_description>Version Used: .NET SDK 9.0.305

When analyzing an incremental build for a partner team, the CopyRefAssembly Task stood out for consuming a significant amount of time across the build overall.

This Task is called fully for every TFM of every C# project even on incremental builds, so anything that can be done to optimize it accrues quite nicely. </issue_description>

<agent_instructions>Follow @jjonescz's suggestion - update the CopyRefAssembly Task to check if the size/timestamp of the source and destination files are equivalent in order to short-circuit before performing mvid-based checking. Make sure to update tests to confirm the new behavior.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@jjonescz Looks like the CopyRefAssembly task checks the MVID of each source/destination assembly pair to determine whether to skip the copy. That seems like it could be improved by checking the size and timestamp of the files first, like the MSBuild Copy task does.

Copilot, Oct 10 '25 15:10

@jaredpar FYI

jjonescz, Oct 13 '25 09:10

In general, I expect a bot to put label "Area-Compilers" on PRs that touch compiler code. For example: [image]

But for some reason, Copilot PRs don't seem to be getting this treatment...

jcouv, Oct 14 '25 17:10

> But for some reason, Copilot PRs don't seem to be getting this treatment...

Oh, interesting. Copilot PRs do not have the required pull-requests: write permission, leaving the actions as needing approval. https://github.com/dotnet/roslyn/actions/runs/18410427614

I will investigate this within the next week, as it will apply to all repos using dotnet/issue-labeler. Thanks for letting me know; I logged "Pull Request Labeling does not run automatically for Copilot PRs" (dotnet/issue-labeler#105).

jeffhandley, Oct 15 '25 04:10

From offline discussion it seems the benefits are not significant, so I'm unsure if we want to continue working on this PR, @baronfel?

These are results @jaredpar measured (using the benchmark from this PR presumably): https://gist.github.com/jaredpar/6651f8555f11232ac3797ccfcac0049c

jjonescz, Oct 23 '25 09:10

I still want this, given that it happens on every single build of every single project, even those that are incredibly incremental.

100ms on framework and 30ms on core does add up over time.

baronfel, Oct 23 '25 11:10

@baronfel Moving this PR to draft as our PR queue is pretty full. There's only minor feedback to address (removing benchmark tests). Feel free to undraft when ready.

jcouv, Dec 04 '25 19:12