project-system icon indicating copy to clipboard operation
project-system copied to clipboard

Avoid downloading MSBuild SDKs from NuGet on the VS UI thread

Open dsplaisted opened this issue 4 years ago • 16 comments

The MSBuild NuGetSdkResolver may download NuGet packages during MSBuild evaluation, which can happen on the UI thread in VS. Some discussion of this: https://github.com/microsoft/msbuild/issues/4025

We are adding some features to MSBuild SDK Resolvers to support optional SDK workloads for .NET 5. This will allow SDK resolvers to:

  • Return any number of SDK paths (zero, one, or many). It will be possible to return a successful result that returns no SDK paths.
  • Return MSBuild items and properties to add to the evaluation result

These features will also allow us to improve the NuGetSdkResolver experience in VS. How this could work is outlined in the "NuGet SDK Resolver" section of this designs PR: https://github.com/dotnet/designs/pull/104

The proposal is this: By default, the NuGet SDK resolver would continue to work as it does today. However, in Visual Studio it would be set to a mode which would disable acquisition of the NuGet packages. In this mode, if an SDK NuGet package wasn't already available locally, the resolver would not download the package, but would add a MissingMSBuildSDK item with the name and version of the SDK that needs to be downloaded.

The project system would check for MissingMSBuildSDK items after the project is evaluated. If there are any, the project system would not load the project normally. It would launch an async acquisition process to download the NuGet packages, while showing appropriate UI (for example a spinning progress bar, or "loading..." text by the project in Solution Explorer). When the package acquisition finishes, the project system would reload the project.

We may need to add the ability for resolvers to read global properties in order for VS to communicate to the NuGetSdkResolver that it should run in this different mode.

dsplaisted avatar Apr 07 '20 17:04 dsplaisted

I would much prefer the communication to be something as simple as:

  • We get a list of missing packages
  • We pass these to an API (from NuGet) and it downloads it.

We can handle the prompt, etc, but using the project evaluation as side-effect of acquiring the packages would be very hard to coordinate with the solution and the project load life cycle.

davkean avatar Apr 07 '20 23:04 davkean

@dsplaisted Given SDKs can refer to other SDKs, how will MSBuild be able to provide a complete set of MissingMSBuildSDK items to avoid the load project -> prompt -> load project -> prompt?

davkean avatar Apr 07 '20 23:04 davkean

@davkean might be best for the user if the prompt cleared itself once everything's loaded, regardless of how many downloads occur. The user shouldn't have to interact with the message.

drewnoakes avatar Apr 08 '20 23:04 drewnoakes

I discussed this with @davkean yesterday. The proposal is that MSBuild would provide a list of SDKs to download. There are some cases where we couldn't know the full dependencies on the first evaluation. This can happen if a NuGet SDK imports another NuGet SDK, or if a NuGet SDK ends up adding a dependency on an optional workload that would need to be installed via in-product acquisition.

As Drew notes, the experience is mitigated if there isn't a prompt the user has to click through in order to download NuGet packages.

Dave suggested that a cheaper first step would be to fail the load in VS if there were unresolved NuGet SDKs, with a message that the project needs to be built from the command line first in order to restore the packages, so that VS can open the project.

dsplaisted avatar Apr 08 '20 23:04 dsplaisted

Triage: @davkean is this something that you should own or should we give it to someone else?

jjmew avatar Apr 14 '20 00:04 jjmew

I am following up with @marcpopMSFT

jjmew avatar Apr 21 '20 00:04 jjmew

The idea is that we will block it and ask the user to use the command line to restore. Later on we might do something beyond that.

jjmew avatar May 15 '20 00:05 jjmew

@swesonga Could you work on this for 16.9?

jjmew avatar Oct 06 '20 15:10 jjmew

If there is only time for the error message in 16.9, we should collect some feedback from internal folks so I started a mail thread. That way we can determine if customers would rather a UI hang versus an error message and command line restore. @KathleenDollard for feedback as well.

marcpopMSFT avatar Oct 07 '20 21:10 marcpopMSFT

Note: this full on hung VS for me, forcing me to force-quit it.

CyrusNajmabadi avatar Aug 03 '21 23:08 CyrusNajmabadi

@melytc - Please feel free to sync with @jeffkl on this item.

aortiz-msft avatar Feb 25 '22 01:02 aortiz-msft

Per @jeffkl and @melytc many of the worst problems have been improved here and Jeff continues to make improvements. We're still going to leave this active to consider moving the download off of the UI thread but it may not be as critical anymore.

marcpopMSFT avatar Mar 16 '22 21:03 marcpopMSFT

@marcpopMSFT Evaluation is already done off the UI thread in CPS, so moving the download off the UI thread does not resolve the problem. However, VS in lots of circumstances, needs to block on evaluation from the UI thread and that evaluation hitting the network is the problem.

@jeffkl How have you determined the worst problems have been improved here?

davkean avatar Mar 16 '22 21:03 davkean

How have you determined the worst problems have been improved here?

I have made some big improvements around the whole scenario. There's a lot more going on than just downloading a package. For each design-time build, a global.json is searched for and deserialized, NuGet.Config is loaded, then the disk is searched for an existing package, then finally a package is downloaded. Also, MSBuild's SDK resolution APIs were launching a lot of threads to handle out-of-proc design-time builds (evaluations) in the main node which could lead to thread starvation. I also moved the NuGet-based MSBuild project SDK resolver down in priority so that the .NET SDK resolver runs first in case its just a built in SDK being resolved.

I have more work to do in this area so don't think I'm done improving things.

https://github.com/NuGet/Home/issues/11441

jeffkl avatar Mar 17 '22 16:03 jeffkl

I've seen that, but most traces I've seen around this are caused by the network hit. For example, for the original bug against msbuild, the symptoms that led me to file was that the corefx solution was taking 15 minutes to open - what data are we using to determine that the underlying problem is resolved?

davkean avatar Mar 17 '22 19:03 davkean

Did it take 15 minutes to download a package? I'm assuming that there was some thread starvation going on where a ton of threads were all trying to resolve an SDK and it took a lot longer to get a nupkg.

I have never been able to reproduce the issue. I've generated a large solution that references five different SDKs. I've injected test SDK resolvers that have sleeps, and I've tried clone repos that exhibit the problems but I have never been able to get it to happen live for me to debug.

I am not making wild claims that its all fixed. I have been making changes to the whole stack that should improve the experience that should make a difference. Do you have time to help me get some data that I can use to make meaningful comparisons?

jeffkl avatar Mar 17 '22 19:03 jeffkl