Overview
To keep our engineering systems up to date and as efficient as possible, we will deliver feature work that reduces the complexity and Azure spend of our systems while moving them onto long-term supported versions of .NET Core. As part of this work, we will build the ability to detect extraneous objects in our subscriptions so that we can respond and clean them up. This epic seeks to reduce known, tech-debt-derived complexity and Azure spending in the daily operations of the .NET Core Engineering Services team, and to offload regularly executed processes to the DncEng vendor team, freeing up engineering FTE time.
Consolidating the Service Fabric clusters we use will permanently save 104 D_v3 cores in Azure, along with the long-term costs of patching, maintaining, and deploying to this cluster. It also means fewer Service 360 issues to respond to when clusters fall out of compliance, and simpler porting (half the work) if we need to move to a different data center or VM size. Moving the autoscaler into the services deployed by the dotnet-helix-service repo will allow deployment of updated build and test images even when the scaler is experiencing issues, making build image updates easier.
Goals
- [ ] Reduce core usage, improve deployment speed, and eliminate patching costs for services provided by DncEng by consolidating two other systems into the same Service Fabric cluster (104 cores, 20-30 minutes per CI build)
- [ ] Ensure that all .NET Core Engineering services run on long-term supported versions of .NET Core
- [ ] Understand what we have in our subscriptions, and what we're spending money on in Azure, by building the ability to snapshot the resources we use and compare them to previous points in time
- [ ] Onboard more machine-maintenance tasks to the vendor team
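
The snapshot-and-compare goal above can be illustrated with a minimal sketch. It assumes a snapshot is just a list of records holding a resource ID and a resource type; the `Resource` class and `diff_snapshots` function are hypothetical names invented for this illustration, not part of any existing DncEng tool or Azure SDK, and collecting the snapshot from Azure is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One billable object in a snapshot (hypothetical shape, not an Azure SDK type)."""
    resource_id: str
    resource_type: str


def diff_snapshots(previous, current):
    """Return (added, removed) resources between two snapshots, each sorted by ID."""
    prev_by_id = {r.resource_id: r for r in previous}
    curr_by_id = {r.resource_id: r for r in current}
    # Resources present now but not before: candidates for "is this expected?"
    added = [curr_by_id[i] for i in sorted(curr_by_id.keys() - prev_by_id.keys())]
    # Resources present before but gone now: confirms that cleanup took effect
    removed = [prev_by_id[i] for i in sorted(prev_by_id.keys() - curr_by_id.keys())]
    return added, removed
```

Calling `diff_snapshots(last_month, today)` on two saved snapshots would yield the objects that appeared and disappeared in between, which is the list a periodic review would walk through.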
Required for Epic
| Deliverable | Owner(s) | Completion Date | Status | Notes |
| --- | --- | --- | --- | --- |
| dotnet-helix-service | @ChadNedzlek | 5/11/2022 | Complete | |
| arcade-services | @ChadNedzlek | 5/04/2022 | Complete | |
| dotnet-helix-machines | @garath | 10/7/2022 | Not Started | |
| dotnet-migrate-package | @garath | 8/30/2022 | Complete | |
| dotnet-xliff-tasks | @jonfortescue | 8/17/2022 | Complete | |
| dotnet-source-indexer | @jonfortescue | 8/30/2022 | Complete | |

| Deliverable | Owner(s) | Completion Date | Status | Notes |
| --- | --- | --- | --- | --- |
| .NET Core Engineering team has no known leakage of objects that can continuously grow (images, storage size, etc) | @MattGal | 4/15/2022 | Complete | - |
| The DncEng team can run a tool and document all "interesting" (billable) objects in their subscriptions in a simple, reliable manner. | @garath | TBD | Not Started | - |
| Process documentation exists and has been handed off to vendor for periodic evaluation and reporting. | @garath | TBD | Not Started | - |

| Deliverable | Owner(s) | Completion Date | Status | Notes |
| --- | --- | --- | --- | --- |
| Port autoscaler service fabric to dotnet-helix-services cluster | @MattGal | 9/10/2022 | Completed | |
| Port VM cleaner to dotnet-helix-services cluster as web job | @MattGal | 9/21/2022 | In progress | |
| Decommission old resources once moved | @MattGal | 10/1/2022 | In progress | |

| Deliverable | Owner(s) | Completion Date | Status | Notes |
| --- | --- | --- | --- | --- |
| Visual Studio version update tick-tock onboarded to vendor team | @jonfortescue | 09/30/2022 | In progress | |
| Azure Gallery base image update version work onboarded to vendor team | TBD | TBD | Not Started | |
| Vendor team can generate rollout release notes | TBD | TBD | Not Started | |
| Vendor team can generate rollout pull requests | TBD | TBD | Not Started | |
Completed
| Deliverable | Owner(s) | Completion Date | Status | Notes |
| --- | --- | --- | --- | --- |
Recently Triaged Issues
All issues in this section should be triaged by the v-team into one of their business objectives or features.