
Change incremental build to account for more kinds of state change than timestamps

Open Sarabeth-Jaffe-Microsoft opened this issue 9 years ago • 5 comments

This issue is in the open design phase and is a part of the project.json -> csproj conversion effort.

Incremental build in the face of globs

MSBuild considers only the set of items computed in the current build when deciding whether to execute a target. When files are included via wildcard, deleting an input creates a problem: the output is up to date with respect to every remaining input, yet it should be rebuilt because the set of inputs has changed.

We could address this by building a state store and using it to compare the current build inputs & outputs against the previous build. In order to accomplish this, we would need to complete the following:

  • [ ] Identify a unique address for a target instance (each target instance needs to be uniquely identifiable and gets its own persisted incremental state).
  • [ ] Figure out the directory structure for where to save the caches (machine-wide, per project, etc.).
  • [ ] Flesh out cache invalidation scenarios (e.g. command line changed, environment variables changed). Stretch goal: rebuild when properties change as well as inputs/outputs.
  • [ ] Stretch goal: Unify the multiple incremental implementations in MSBuild.
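
To make the state-store idea concrete, here is a rough sketch in Python rather than MSBuild's own code. Everything in it is an assumption for illustration: the `target_key` and `needs_rebuild` helpers, the JSON state file, and the choice to key a target instance on project path, target name, and global properties are all hypothetical. It only shows how remembering the previous input set catches a deleted input that a timestamp comparison would miss.

```python
import hashlib
import json
from pathlib import Path


def target_key(project_path: str, target_name: str, global_properties: dict) -> str:
    """Hypothetical unique address for a target instance."""
    payload = json.dumps(
        [project_path, target_name, sorted(global_properties.items())]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_rebuild(state_file: Path, key: str, inputs: list) -> bool:
    """Return True when the input *set* differs from the last recorded build.

    Only the set of input paths is compared here; a real check would still
    combine this with the existing timestamp comparison.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = sorted(inputs)
    changed = state.get(key) != current
    # Persist the current input set so the next build can compare against it.
    state[key] = current
    state_file.write_text(json.dumps(state))
    return changed
```

With this shape, a build that globbed `a.cs` and `b.cs` last time and only `a.cs` this time reports a change even though every remaining file is older than the output.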

Closing this as we won't be going with this implementation for now.

Reopening because we should have a bug to track "it'd be nice if we could do incremental build via hashes" and "it'd be nice if we could do incremental build correctly in the face of property changes between builds".

rainersigwald avatar Jun 29 '21 14:06 rainersigwald

We had a meeting and discussed this. Currently there are likely many targets (especially around linker/AOT/publish scenarios) which are not incremental, and we don't think it's feasible to make them correctly incremental without some validation that the target "inputs" are correctly declared.

We think we could address this with a design something like the following:

  • Add a TargetDataCache attribute to the Target element to tell MSBuild where to store incrementality data. SDK code would use a different file under the intermediate output directory for each target.
    • It might be too much overhead to have a separate file for each target if they are all using this feature. In that case we might need to consolidate them.
  • Inputs to a Target should support both properties and items. Instead of expanding / flattening the values as is currently done, MSBuild would need to preserve the structure (for example which items have which values). Item metadata should also likely be accounted for.
    • For speed, the default mode might be to simply create a hash of the inputs. If so, it would be useful to have a different mode that stores all of the values, so that the binlog could list which changed values caused a target to be built fully.
  • An option to run MSBuild in "Debug" mode where it will record all property and item reads in a target, and compare that to the inputs that were declared on the target. If they don't match, the declared inputs need to be updated, and MSBuild should either emit the information to the binlog or generate an error.
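
The difference between the hash-only mode and the store-everything mode can be sketched as follows. This is a Python illustration under assumptions, not MSBuild code: the `canonical`, `fingerprint`, `flatten`, and `diff` helpers and the snapshot layout are hypothetical. The point is that a single hash can say that something changed, while naming the changed value requires the previous values themselves.

```python
import hashlib
import json


def canonical(properties: dict, items: dict) -> dict:
    """Keep structure: each item type maps item specs to their metadata."""
    return {
        "properties": dict(sorted(properties.items())),
        "items": {
            item_type: {
                spec: dict(sorted(meta.items()))
                for spec, meta in sorted(specs.items())
            }
            for item_type, specs in sorted(items.items())
        },
    }


def fingerprint(snapshot: dict) -> str:
    """Fast mode: one hash; says *that* something changed, not *what*."""
    data = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def flatten(snapshot: dict) -> dict:
    """Give every property and item a name that can be reported in a log."""
    flat = {f"property:{name}": value
            for name, value in snapshot["properties"].items()}
    for item_type, specs in snapshot["items"].items():
        for spec, meta in specs.items():
            flat[f"item:{item_type}:{spec}"] = meta
    return flat


def diff(previous: dict, current: dict) -> list:
    """Detailed mode: requires the full previous snapshot, not just its hash."""
    old, new = flatten(previous), flatten(current)
    changes = []
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            changes.append(f"removed {name}")
        elif name not in old:
            changes.append(f"added {name}")
        elif old[name] != new[name]:
            changes.append(f"changed {name}")
    return changes
```

A middle ground, also only an assumption here, would be to store one hash per property and per item, which keeps the cache small while still identifying which entry changed.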

I'm marking this as needs triage to discuss whether we could schedule this for the .NET 9 timeframe.

dsplaisted avatar May 23 '23 14:05 dsplaisted

> For speed, the default mode might be to simply create a hash of the inputs. If so, it would be useful to have a different mode that stored all of the values so that the binlog could list which values changed causing a target to be built fully

If all we know of what the state was before is a hash of the inputs, how could we figure out which of the inputs changed, causing the rebuild?

Forgind avatar Jun 12 '23 21:06 Forgind

> An option to run MSBuild in "Debug" mode where it will record all property and item reads in a target, and compare that to the inputs that were declared on the target. If they don't match, the declared inputs need to be updated, and MSBuild should either emit the information to the binlog or generate an error.

I really like this idea, but it also sounds like a kinda heavy lift to me. There's a lot you can do in targets, and we don't have any consolidated way to check whether any particular property or item has been read. That said, if we decide to implement this part, I'd consider addressing a related issue at the same time: raising an error when someone sets a property and then never uses it (i.e., typos).
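
As a rough illustration of what such read tracking would have to report, here is a Python sketch. It is not how MSBuild evaluates properties; the `RecordingProperties` wrapper and the `audit` function are hypothetical names, and the sketch assumes every read and write could be funneled through one place, which is exactly the consolidated hook that doesn't exist today.

```python
class RecordingProperties:
    """Hypothetical wrapper that notes every property read and write."""

    def __init__(self, values: dict):
        self._values = dict(values)
        self.reads = set()
        self.writes = set()

    def get(self, name: str) -> str:
        self.reads.add(name)
        return self._values.get(name, "")  # unset properties read as empty

    def set(self, name: str, value: str) -> None:
        self.writes.add(name)
        self._values[name] = value


def audit(declared_inputs: set, props: RecordingProperties) -> dict:
    """Compare what a target declared against what it actually touched."""
    return {
        # Read by the target but missing from its declared inputs.
        "undeclared_reads": sorted(props.reads - declared_inputs),
        # Declared as inputs but never read.
        "unused_declarations": sorted(declared_inputs - props.reads),
        # Set but never read afterwards, which is often a typo.
        "never_read_writes": sorted(props.writes - props.reads),
    }
```

The same bookkeeping covers both ideas: undeclared reads point at incomplete target inputs, and writes that are never read point at likely typos.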

Forgind avatar Jun 12 '23 21:06 Forgind