msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

enable out of process execution of inline tasks

Open JanProvaznik opened this issue 6 months ago • 10 comments

Fixes #11914

Context

Currently all 3 MSBuild shipped Inline task factories (RoslynCodeTaskFactory, CodeTaskFactory, XamlTaskFactory) work this way: inproc they initialize and create+load an in memory assembly (somehow) containing an ITask. Then since execution is in the same process it can instantiate it.

Changes Made

Strategy for porting is to create the assembly on disk instead (still inproc) and tell TaskHost node where load it. and on endbuild clean disk assemblies.

strategy for cleanup: create the dlls in a directory of the form MSBuildTemp/InlineTaskDllSubPath/pid_{process_NUM} after build try to delete all directories in InlineTaskDllSubPath, those that are in use (by this process or other active build processes) won't be deleted because Assembly.LoadFrom locks them

Testing

existing tests +same tests with the force out of proc envvar

Notes

Alternatives I considered: memory mapped file of the assembly - seems much more annoying to architect how to pass this (currently the change is quite minimal), but would make the disk assembly file management cleaner Initialize also out of proc - ✖ no need for isolation, hard to rearchitect

Using this in theory regresses perf - the dll is loaded at least twice + taskhost overhead (https://github.com/dotnet/msbuild/issues/11335 sidecar taskhost helps but does not mitigate completely ) Not using this via the feature flag is perf neutral

having the assemblies on disk makes https://github.com/dotnet/msbuild/issues/10164 easier having the assemblies on disk is risky because if the process crashes, we don't clean them up and they could add up

For user defined custom task factories probably not possible backwards compatible and enable out of proc execution, the assemblies can't travel across process. https://github.com/dotnet/msbuild/issues/11945 I'd consider 1. removing support for them 2. specifying where they should create the assembly dll so it's then predictably cleaned up (make the OutOfProcTaskFactory interface public with the expectation that the assemblies are created the same way as shipped inline taskfactories)

JanProvaznik avatar Jun 03 '25 16:06 JanProvaznik

"send the information about the task to the host process and do the older process there" is more what I was expecting--can you elaborate a bit more on the cons of that approach?

rainersigwald avatar Jun 05 '25 13:06 rainersigwald

"send the information about the task to the host process and do the older process there" is more what I was expecting--can you elaborate a bit more on the cons of that approach?

  • complexity of implementing node communication about Factory initialization, the TaskHostTask wrapping was there already, once the task is in a dll it's just as if we had any other task
  • the initialization (compilation) happens during the search for the task in TaskRegistry and it would be hard to keep in sync with how it currently logs errors when it is sent to a different process
  • I think caching assemblies would need sidecar taskhost first

JanProvaznik avatar Jun 05 '25 14:06 JanProvaznik

For the benefits of suggested approach, I will add that even with sidecar taskhosts, compilation of the task assembly in the worst case would be needed on each of those. And compilation overhead is quite big - around 2 seconds according to @JanProvaznik's data. Having it compiled and then location cached in the main node minimizes that overhead.

We are also nicely reusing current infrastructure - that is the minimal additional complexity mentioned by @JanProvaznik

The drawback is obvious - instead of creating a temp file for maximum mere seconds we create it for the duration of the build. And if the process is killed, the temp files would not be cleaned up - that's the main risk.

AR-May avatar Jun 06 '25 08:06 AR-May

I'm not sure how severe the risk is, how often a nongraceful process exit happens and what would be the expectation of disk overuse when you're building on a dev machine X times per day...

a maybe silly idea: probabilistic deep cleanup

  • 1/100 before exiting look if there are any other VS/msbuild.exe/dotnet processes running, if no delete the parent directory of the process's dll temp directory (which removes all dlls left over by prior processes)

I am very open to suggestions how to manage these files another way.

data: a trivial inline task dll is 4kb

JanProvaznik avatar Jun 06 '25 09:06 JanProvaznik

new disposal idea thx @AR-May @rainersigwald : I have to test it if the OS API really works this way and the extent to which this works on *nix

in the Factory initialization open a file handle for the compiled dll with a flag FileOptions.DeleteOnClose, save it in the factory's property so it's not GCd, the flag is supposed to mark the file for deletion on the OS (for windows)/CLR (for *nix) level. Once the process dies the file handle dies and the file is deleted. (on *nix it's best effort because if you kill -9 the process the CLR doesn't have time to do cleanup) Nice theory...

JanProvaznik avatar Jun 06 '25 16:06 JanProvaznik

new disposal idea thx Alina, Rainer: I have to test it if the OS API really works this way and the extent to which this works on *nix

in the Factory initialization open a file handle for the compiled dll with a flag FileOptions.DeleteOnClose, save it in the factory's property so it's not GCd, the flag is supposed to mark the file for deletion on the OS (for windows)/CLR (for *nix) level. Once the process dies the file handle dies and the file is deleted. (on *nix it's best effort because if you kill -9 the process the CLR doesn't have time to do cleanup) Nice theory...

not so simple, because how we load in taskhost nodes now we use Assembly.Load and the DeleteOnClose requires that the file is opened with the FILE_SHARE_DELETE which Assembly.Load does not expose api for doing that.

Possible workaround for that is detecting the "temporary dll file" situation and loading first to memory by opening with the flag and then loading from memory.

Which needs more logic to pass the path... because the location of the in memory assembly is empty again.

JanProvaznik avatar Jun 09 '25 12:06 JanProvaznik

complication: the references of the inline tasks have to be somehow loaded on the taskhost and they could be arbitrary dlls, this resolve reference logic is nontrivial LoC in each factory

JanProvaznik avatar Jun 10 '25 15:06 JanProvaznik

So the inline tasks can have references. I glossed over that initially thinking the resolution info would be in the dll so calling Assembly.Load on the dll with the task would be enough. Apparently it does not work that way, (also CodeTaskFactory and RoslynCodeTaskFactory do it differently). So there is a bunch of logic for this reference resolution and loading that's not cross-process compatible.

brainstorming how to proceed: 1.  reconstruct the loading logic in taskhost, the resolution before compilation in main proc would place a metadata file next to the dll that is loaded by the taskhost telling it how to load the dll. Sounds nice in theory, not sure how the implementation will go...   1b. implement it in node communication instead of a file? 2. copy the referenced dlls next to the temp dll which would lead to them being found when loading the task dll. Increases the severity of disk space leaks. 3. reconsider the decision to split the initialization and execution -> compile and execute in the same sidecar taskhost and then make sure the taskhost is reused for instances of execution to avoid recompilation. I am dreading to rearchitect this and depends on sidecar.

I am inclined to start with 1. Though I am afraid of what more roadblocks I'll find.

JanProvaznik avatar Jun 10 '25 16:06 JanProvaznik

Note: currently when you declared task parameters with output/required in xml instead of in code the code that is generated did not contain them and it relied on some. TaskHost can't do that, so I added the attribute generation. I think it's more correct now, but don't know if there could be any consequences.

JanProvaznik avatar Jun 16 '25 11:06 JanProvaznik

Note: XamlTaskFactory has no test coverage for runtime of tasks, and the test coverage of RoslynCodeTaskFactory/CodeTaskFactory is not great. It's good that this is opt-in.

JanProvaznik avatar Jun 16 '25 11:06 JanProvaznik

IMO this is done and waiting for the threat model review, if the sidecar taskhost pr is reintroduced it would need adjustment again (I reverted it when sidecar taskhost was reverted)

JanProvaznik avatar Jul 18 '25 12:07 JanProvaznik

New attempt to move forward: load as bytes, workaround the location in the assembly communication infrastructure. This is supposed to allow deleting the temporary dll on build end which is cleaner

JanProvaznik avatar Aug 22 '25 14:08 JanProvaznik

ready for another review after major changes @AR-May

JanProvaznik avatar Sep 30 '25 14:09 JanProvaznik

I'm creating 2 exp insertions: one for validating this does not break VS when off - passed one for validating impact on VS when the trait is enabled - expected error when loading xamltaskfactory from tasks.core.v4.0.dll (strategy what to do in that case TBD)

JanProvaznik avatar Oct 02 '25 16:10 JanProvaznik