sdk icon indicating copy to clipboard operation
sdk copied to clipboard

Installer to SDK merge - Phase 1

Open MiYanni opened this issue 1 year ago • 4 comments

Sibling PR: https://github.com/dotnet/installer/pull/17959

[!NOTE] This PR will continue to have merge conflicts until a manual sync is done, which involves updating the sibling Installer branch and this branch at the same time.

Summary

These are the "Phase 1" changes for merging the Installer repo into this repo. A majority of this PR is new files. Those files are generally the same as the ones seen in the sibling PR in the Installer repo. There are some changes necessary for them to work in this repo (such as path changes and that this repo uses Central Package Management). So, any new files are best reviewed in that PR.

For the changes to the files in this repo, some adjustments were necessary, such as changes to the build or CI. For example, passing -sign to the non-Windows jobs in CI will not function now as signing was updated to include the Installer assets and signing logic from the Installer repo. Arcade currently doesn't support non-Windows signing anyway.

The high-level strategy was to simply get the Installer build working within the SDK repo. There has been no consolidation done to the build, as that is part of Phase 2. Basically, the Installer build (redist-installer) happens after the normal SDK build (redist) occurs. The information below should give some context and strategy for the changes that took place to achieve this.

Project Overview (Additions/Alterations)

  • Added src/Installer which contains the main components of this PR
    • src/Installer/core-sdk-tasks contains a project with MSBuild tasks that facilitate the creation of the SDK installer.
    • src/Installer/finalizer contains a C++ project that creates a small binary used within the SDK installer. It interacts with the WiX SDK.
    • src/Installer/redist-installer contains the project that actually builds the SDK installer. For this PR, this runs after the standard SDK redist in src/Layout/redist.
  • Several projects added/adjusted in test
    • test/EndToEnd.Tests has many additional tests from the Installer repo added to it. However, these new tests were not converted into Helix tests. Therefore, this test project has been disabled for this PR. It will be re-enabled in a follow-up PR.
    • test/Microsoft.DotNet.Tools.Tests.Utilities.Tests was removed. The 2 test source files were actually testing Microsoft.DotNet.Cli.Utils, so these were moved to Microsoft.DotNet.Cli.Utils.Tests.
    • test/Microsoft.DotNet.Tools.Tests.Utilities was added. This seems to be a historical bifurcation of Microsoft.NET.TestFramework. A follow-up PR would combine the contents of this into Microsoft.NET.TestFramework and remove this project. In this PR, this project is only used by EndToEnd.Tests.
    • test/TestAssets was added. This contains test projects used by both Microsoft.DotNet.Tools.Tests.Utilities and EndToEnd.Tests. In a follow-up PR, these will likely be within the EndToEnd.Tests project folder since they'll only apply to that.
    • test/core-sdk-tasks.Tests was added. This contains tests for the core-sdk-tasks. Only contains 2 test source code files.

Test Builds

  • Public
    • ~~https://dev.azure.com/dnceng-public/public/_build/results?buildId=568848~~
    • ~~https://dev.azure.com/dnceng-public/public/_build/results?buildId=584811~~
    • https://dev.azure.com/dnceng-public/public/_build/results?buildId=588514
  • Internal
    • ~~https://dnceng.visualstudio.com/internal/_build/results?buildId=2380529~~
    • ~~https://dnceng.visualstudio.com/internal/_build/results?buildId=2391019~~
    • https://dnceng.visualstudio.com/internal/_build/results?buildId=2393481

FAQ

What is "Phase 1" and why have more than one phase?

"Phase 1" is part of a 2-phase approach. This only involves getting the logic necessary to build the SDK installer into the SDK repo. This asset will not be used in production. The Installer repo will still be used as it is currently functioning. The reason for this phase is to verify we can maintain building the Installer bits in the SDK repo. It also allows for piecemeal changes that work toward the completion of Phase 2. Otherwise, trying to accomplish all of these changes at once would be a massive (months-long) endeavor with high risk. This helps reduce risk and allows functionality to "turn on" one-at-a-time.

What is "Phase 2"?

The definition of "done" for this entire endeavor will be at the end of "Phase 2". This involves:

  • Disabling the main branch in the Installer repo
  • Redirecting all CI (SourceBuild, Staging, etc.) to use the SDK repo
  • Combining the redist process into a single, cohesive project (combine redist and redist-installer into one)
  • Removing the logic for the staged SDK build to produce a "pseudo" SDK and instead, using the actual produced SDK from the build
  • Allow the production of the installer to be togglable in the CI build
  • Verifying all pipelines produce the correct artifacts and publish to the right locations
  • Verify Installer continues to work properly for all .NET servicing branches prior to .NET 9

These changes are able to be worked on piecemeal (individual PRs) once the Phase 1 PR is merged.

Will this negatively affect SDK build times?

Currently, yes. Without consolidation, after this PR is merged, it will increase the build time of the SDK repo since the Installer is always built. This is done so we can (currently) maintain the ability to build the installer within this repo, as code flows and other changes occur. As part of the Phase 2 changes, we can decide when it makes sense to build the installer and adjust accordingly.

What CodeFlow dependencies were added?

Here are the CodeFlow changes that will need to take place when this PR is merged. I've ordered them by repo and the sub-items are the different packages from that repo.

  • https://github.com/dotnet/arcade
    • Microsoft.DotNet.Build.Tasks.Installers
    • Microsoft.DotNet.CMake.Sdk
  • https://github.com/dotnet/arcade-services
    • Microsoft.DotNet.Darc
    • Microsoft.DotNet.DarcLib
  • https://github.com/dotnet/aspire
    • Microsoft.NET.Sdk.Aspire.Manifest-9.0.100-preview.1
    • Microsoft.SourceBuild.Intermediate.aspire
  • https://github.com/dotnet/core-setup
    • NETStandard.Library.Ref
  • https://github.com/dotnet/runtime
    • Microsoft.Extensions.Logging.Console
  • https://github.com/dotnet/scenario-tests
    • Microsoft.DotNet.ScenarioTests.SdkTemplateTests
    • Microsoft.SourceBuild.Intermediate.scenario-tests
  • https://github.com/dotnet/test-templates
    • Microsoft.DotNet.Test.ProjectTemplates.2.1
    • Microsoft.DotNet.Test.ProjectTemplates.5.0
    • Microsoft.DotNet.Test.ProjectTemplates.6.0
    • Microsoft.DotNet.Test.ProjectTemplates.7.0
    • Microsoft.DotNet.Test.ProjectTemplates.8.0
    • Microsoft.DotNet.Test.ProjectTemplates.9.0
    • Microsoft.SourceBuild.Intermediate.test-templates
  • https://github.com/dotnet/winforms
    • Microsoft.Dotnet.WinForms.ProjectTemplates
  • https://github.com/dotnet/wpf
    • Microsoft.DotNet.Wpf.ProjectTemplates

What is the strategy for keeping the Installer code in this repo up-to-date during Phase 1?

Since Phase 1 still relies on the Installer repo, regular 'syncs' need to take place. Currently, I'm hoping that a once-a-week sync will be sufficient. I'll keep a standing branch of "previous-main" in Installer and do a flat comparison against current "main". I'll make a draft PR to see the flat changes that have taken place over the week. Then, I manually make those updates to the files in this repo. Lastly, I update "previous-main" to the commit from "main" for which I did the comparison. That process will continue to happen until the completion of Phase 2.

How will Installer updates be handled for servicing branches?

Until all the servicing branches prior to .NET 9 go out-of-support, changes from those branches need to manually be recreated in the SDK repo. This means that, for us to accept Installer servicing branch changes, those PRs will require a sibling PR be created in the SDK repo by the PR author. There is no simple way to do this automatically as the file paths in the repos and the contents can be different depending on the changes required. You cannot simply "apply" the commit from the one repo into the other one. We'll continue to monitor if this becomes a significant problem and adjust the strategy accordingly. Also note that the files in this repo (SDK) will likely dramatically divert from the ones in the Installer repo during the Phase 2 changes.

MiYanni avatar Feb 15 '24 18:02 MiYanni

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

Just a general question on the direction here. For the code that gets ported over from installer into sdk you will include git history with all the relevant commits, right?

ViktorHofer avatar Mar 06 '24 09:03 ViktorHofer

@ViktorHofer

Just a general question on the direction here. For the code that gets ported over from installer into sdk you will include git history with all the relevant commits, right?

No. That wasn't in the plan. If the Installer repo was an independent component of the SDK (similar to dotnet-format) and wanted to merge into the repo, preserving the history isn't so bad. But, Installer and SDK had a split sometime in the past. So, there's actually a lot of code that is similar/same between the two.

After this PR, I'll be deeply integrating this code into the SDK logic (specifically, the testing utilities and the redist). The goal is that the Installer is just "an option" as part of the SDK build process, so it won't really exist as an independent thing.

We decided since it'll change form so dramatically, the Installer repo will (eventually) be archived and all the history can remain there. The Installer build process in this repo should produce the same assets, but likely the majority of the changes to it will already take place in this repo once this PR is merged.

One thing I wasn't sure about was the approach that https://github.com/dotnet/cli took when being migrated here. But I haven't found a need to dig into that history yet.

MiYanni avatar Mar 06 '24 17:03 MiYanni

When we merged corefx, coreclr, core-setup and mono into dotnet/runtime, we brought history along. Same is true for cli, core-sdk and other repos that got merged into dotnet/sdk. Unless there's a really strong reason for it, I would be worried if we don't bring version control information along when merging installer into sdk. cc @jkotas for opinions

ViktorHofer avatar Mar 07 '24 14:03 ViktorHofer

When we merged corefx, coreclr, core-setup and mono into dotnet/runtime, we brought history along. Same is true for cli, core-sdk and other repos that got merged into dotnet/sdk. Unless there's a really strong reason for it, I would be worried if we don't bring version control information along when merging installer into sdk. cc @jkotas for opinions

I hear what you're saying. We took a look at installer and determined there wasn't enough history we wanted to save that was worth doing this merge in the more challenging manner. If git allowed for partial moves of the branch while maintaining history, we could have gone that way but I'm not aware of a way of maintaining history without a brute force take merge everything approach. Open to hearing more feedback here or if there is a way to do this that we're not thinking of.

marcpopMSFT avatar Mar 07 '24 17:03 marcpopMSFT

If git allowed for partial moves of the branch while maintaining history, we could have gone that way but I'm not aware of a way of maintaining history without a brute force take merge everything approach. Open to hearing more feedback here or if there is a way to do this that we're not thinking of.

That's actually possible and is what we've done when migrating the other repositories. These days, git-filter-repo is the go to solution. Years ago it was git filter-branch which isn't recommended to use anymore.

With git-filter-repo, the command looks something like git filter-repo --path src/ --to-subdirectory-filter my-module. With that, you only bring along history that affects the passed in --path. Here's the link to the official tool: https://github.com/newren/git-filter-repo

From their own documentation:

extract the history of a single directory, src/. This means that only paths under src/ remain in the repo, and any commits that only touched paths outside this directory will be removed.

ViktorHofer avatar Mar 07 '24 17:03 ViktorHofer

I hear what you're saying. We took a look at installer and determined there wasn't enough history we wanted to save that was worth doing this merge in the more challenging manner.

I'm only asking these questions as I have dealt with repositories that had zero history anymore and the original repository either got deleted or wasn't obvious anymore. We recently had to guess a lot when rewriting the VMR because the original VMR implementation has zero history. Now with the move from installer to sdk we don't want to loose that history yet again and requiring devs to do hops (checking history in a separate repository) isn't a great solution either.

ViktorHofer avatar Mar 07 '24 17:03 ViktorHofer

I hear what you're saying. We took a look at installer and determined there wasn't enough history we wanted to save that was worth doing this merge in the more challenging manner.

I'm only asking these questions as I have dealt with repositories that had zero history anymore and the original repository either got deleted or wasn't obvious anymore. We recently had to guess a lot when rewriting the VMR because the original VMR implementation has zero history. Now with the move from installer to sdk we don't want to loose that history yet again and requiring devs to do hops (checking history in a separate repository) isn't a great solution either.

Might be worth an offline conversation. Installer is fairly small and for the sdk team, there is limited history there we typically look at beyond the tests. However, it has a lot of source build history. The problem I see is with how git is designed, maintaining history seems to trade off against doing the merge in a more responsible piece by piece manner (unless you know of some way to only do a partial merge in with git that I'm not aware of).

marcpopMSFT avatar Mar 07 '24 18:03 marcpopMSFT

cc @jkotas for opinions

I do not have a strong opinion on this. It is up to the SDK team to make the call whether the extra work required to preserve the history is worth it.

I agree that digging history over multiple repos and source control systems, renames, reformats, and history resets is non-trivial. If we drop the history here, it is going to add one more hop to the affected files, but it is not introducing a fundamentally new problem.

maintaining history seems to trade off against doing the merge in a more responsible piece by piece manner (unless you know of some way to only do a partial merge in with git that I'm not aware of).

The tools that @ViktorHofer mentioned are capable of doing this. Running these tools is an extra work compared to just a simple copy&paste.

jkotas avatar Mar 07 '24 19:03 jkotas

Is it possible to scrub the history, e.g. remove things like code flow PRs / commits (everything under eng), but retain important items that introduced product changes, bug fixes (keep everything under src and test)

joeloff avatar Mar 08 '24 21:03 joeloff

Is it possible to scrub the history, e.g. remove things like code flow PRs / commits (everything under eng), but retain important items that introduced product changes, bug fixes (keep everything under src and test)

In theory, yes, but I think this would a huge challenge in figuring out which changes should be scrubbed, and which should be kept. If there were 5 commits, I'd manually classify them and squash all the irrelevant ones into one commit. We'd lose ordering but keep the contents and messages. That's much harder on this scale.

Forgind avatar Apr 19 '24 19:04 Forgind

@Forgind

In theory, yes, but I think this would a huge challenge in figuring out which changes should be scrubbed, and which should be kept. If there were 5 commits, I'd manually classify them and squash all the irrelevant ones into one commit. We'd lose ordering but keep the contents and messages. That's much harder on this scale.

Additionally, these 2 repos were the same repro pre-2018 (at least from what I noticed when merging in src and test history). So, the history gets really awkward to merge back in after 6 years of changes. Trying to figure out "who's history should remain" is not simple because of that shared past. For now, it was easiest to just keep src and test and ignore everything else. If we need more history, I can move it over but this is good enough to continue moving forward with the repo merge.

MiYanni avatar Apr 19 '24 19:04 MiYanni

@dotnet/dotnet-cli

Summary

I've updated this PR to now include the history brought over from the dotnet/installer repo (from this PR). The process involved rewriting the commit history on this PR since the base for the files were in a different location after the history sync. Therefore, this gave me an opportunity to do the commits in such a way to make them easily reviewable. So, start with the commit containing the description, "Moved files from history migration..." and go down the list to review. Several commits will only be renaming/moving files to make them align with our folder structure and build process in this repo. Other commits will be drop-in replacing the files from dotnet/installer with the versions containing that were originally present in this PR. Therefore, those commits are the ones to review, as you'll see the changes needed to be made to make those files work in this repo. Let me know if you have any questions!

New Highlights

  • Made commits easily reviewable
  • Left out the Microsoft.DotNet.Tools.Tests.Utilities.Tests changes (will do those in a separate PR as they weren't related to the repo merge)
  • Unmerged the EndToEnd.Tests and EndToEnd project
    • Instead, EndToEnd.Tests will continue to run in Helix as it currently does. I've added EndToEnd-Installer.Tests which are the ones from the Installer repo and are not currently Helix compatible.
  • This PR originally had some CI fixes (in the YAML files) but I've removed those fixes and already merged them into the repo via these 2 PRs:
    • https://github.com/dotnet/sdk/pull/40213
    • https://github.com/dotnet/sdk/pull/40186

MiYanni avatar Apr 24 '24 20:04 MiYanni

@ViktorHofer

As discussed offline, please undo the file moves for src/SourceBuild and src/VirtualMonoRepository. I assume you will squash your commits when merging this PR?

These folders have now been moved back up to src. I wasn't planning on squashing, no. I hand-crafted the commits in this new PR to align with the actual tasks accomplished. The original version of this PR likely made sense to squash, but here, I'd like to retain the steps taken to do the migration.

MiYanni avatar Apr 25 '24 19:04 MiYanni

@joeloff

Do we have an internal build of -pack with artifacts. Might be good to compare things like the MSI or PKG we built in installer against what SDK will produce.

Last internal build here (I'm also running another one right now): https://dnceng.visualstudio.com/internal/_build/results?buildId=2437998

During Phase 1, I'm not guaranteeing that the output of this Installer build is "correct" since the point of this phase is to have it building in the repo. It is more like a "priming" phase. In Phase 2, I'll be validating many things, including the way that the CI runs between the two repos. I'll be investigating the artifacts and looking at those accuracies. I did an initial comparison the day I got the build working end-to-end locally and didn't see anything different/missing that wasn't expected. I'll do a more in-depth check after the CI has been updated.

MiYanni avatar Apr 25 '24 20:04 MiYanni

FYI, as of this change when I run the sdk build.cmd script locally on my machine I get this error:

cmake not found on path. Please install cmake latest before proceeding. If this is running on a build machine, the arcade-tools directory was not found, which means there's an error with the image.

So there probably needs to be a line added to the developer guide prerequisites stating that cmake needs to be installed and added to the PATH.

Banner-Keith avatar Apr 30 '24 20:04 Banner-Keith

@Banner-Keith Good find. The requirements for this repo would be similar now to what they were from the dotnet/installer repo: https://github.com/dotnet/installer#build-net-installer

I'm going to be making adjustments to how this builds so that the installer is optional. I should think of a way to make it so that if you don't want the installer, you shouldn't need cmake.

MiYanni avatar May 01 '24 17:05 MiYanni

note only windows need cmake for msi stuff and adding these three lines in global.json automates its installation anyway https://github.com/dotnet/installer/blob/dbe6e6a6a74d137b9de1aa6006f4a34450688cb9/global.json#L10C1-L12C5 so please don't make the decision just on cmake behalf (<10sec one-time installation thing)...

kasperk81 avatar May 01 '24 17:05 kasperk81