AzDO Networking issue impacting multiple builds
Issue for tracking the intermittent, inconsistent networking errors we're encountering in our builds.
https://portal.microsofticm.com/imp/v3/incidents/details/292951370/home
{
"errorMessage" : "net/http: request canceled while waiting for connection"
}
Report
| Build | Definition | Step Name | Console log |
|---|---|---|---|
| 17054 | dotnet/roslyn | Initialize containers | Log |
Summary
| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Hit Count |
|---|---|---|
| 0 | 0 | 1 |
I have created Azure support ticket 2203110010001681 to help address this.
Are there perhaps plans for a public version of the link above so we have a general idea of what the cause of the issue might be?
Good news: it seems my PR is now unblocked by this: https://github.com/dotnet/arcade/pull/8604
Hi @AraHaan - I don't think there's a way to provide public access to our internal issue tracking system. :( In this case, the title kinda says it all. We're having intermittent issues w/ connectivity and the Azure folks are trying to get to the bottom of it.
Great to see your PR seems to have worked!
@adiaaida and/or @mmitche - would one of you mind taking a look at the PR? (looks good to me)
I don't have context on the file being changed, so hopefully Matt can look?
I am adding some new cases affected by this error; see the retry sketch after the list.
- Installer Build and Test coreclr Linux_arm64 Release, Pipelines - Run 20220314.2, error:
docker run -v /mnt/vss/_work/1/s:/root/runtime -w=/root/runtime -e VSS_NUGET_URI_PREFIXES -e VSS_NUGET_ACCESSTOKEN mcr.microsoft.com/dotnet-buildtools/prereqs:rhel-7-rpmpkg-c982313-20174116044113 ./build.sh --ci --subset packs.installers /p:BuildRpmPackage=true /p:Configuration=Release /p:TargetOS=Linux /p:TargetArchitecture=arm64 /p:RuntimeFlavor=coreclr /p:RuntimeArtifactsPath=/root/runtime/artifacts/transport/coreclr /p:RuntimeConfiguration=release /p:LibrariesConfiguration=Release /bl:artifacts/log/Release/msbuild.rpm.installers.binlog
Unable to find image 'mcr.microsoft.com/dotnet-buildtools/prereqs:rhel-7-rpmpkg-c982313-20174116044113' locally
docker: Error response from daemon: Get "https://mcr.microsoft.com/v2/": net/http: request canceled while waiting for connection
- Installer Build and Test coreclr Linux_musl_x64 Release, Pipelines - Run 20220314.22, and Mono Product Build Linux x64 debug, Run 20220316.68, error:
docker: error pulling image configuration: Get "https://westus2.data.mcr.microsoft.com/01031d61e1024861afee5d512651eb9f-h36fskt2ei//docker/registry/v2/blobs/sha256/d3/d3358c58cff96d0874e70d9ef680e5c69a452079d7d651f9e441c48b62a95144/data?se=2022-03-14T18%3A52%3A56Z&sig=7CM6Q6E1lL%2F07ifd%2FR1VVO%2BRlBbCH%2FiCs8V%2Fki%2BvxXE%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=01031d61e1024861afee5d512651eb9f": dial tcp 131.253.33.219:443: i/o timeout.
- Build Android arm Release AllSubsets_Mono, Pipelines - Run 20220314.1, Build Browser wasm Linux Release LibraryTests_EAT, Pipelines - Run 20220315.4, and Build Linux x64 Release AllSubsets_Mono_LLVMJIT, Run 20220316.68, error:
Error response from daemon: Get "https://mcr.microsoft.com/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
##[error]Docker pull failed with exit code 1
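All of the failures above are transient client-side timeouts while pulling images from MCR. As a stopgap while the underlying networking issue is investigated, a wrapper along these lines could retry the pull with backoff before the step gives up. This is only a sketch: the script name, attempt count, and delay values are hypothetical and not part of any existing pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical mitigation sketch (not an existing pipeline step): retry `docker pull`
# with exponential backoff so a transient "net/http: request canceled while waiting
# for connection" failure does not immediately fail the job.
set -euo pipefail

IMAGE="${1:?usage: retry-pull.sh <image>}"   # e.g. mcr.microsoft.com/dotnet-buildtools/prereqs:<tag>
MAX_ATTEMPTS=5
delay=10

for attempt in $(seq 1 "$MAX_ATTEMPTS"); do
  if docker pull "$IMAGE"; then
    exit 0
  fi
  echo "docker pull failed (attempt $attempt/$MAX_ATTEMPTS); retrying in ${delay}s..." >&2
  sleep "$delay"
  delay=$((delay * 2))   # back off: 10s, 20s, 40s, ...
done

echo "##[error]docker pull still failing after $MAX_ATTEMPTS attempts" >&2
exit 1
```

It would be invoked as, e.g., `./retry-pull.sh mcr.microsoft.com/dotnet-buildtools/prereqs:rhel-7-rpmpkg-c982313-20174116044113` before the container step that currently fails outright on the first timeout.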
Are we tracking the numerous MCR download failures here or somewhere else❔ That problem seems to be getting worse instead of better, e.g.:
- https://dev.azure.com/dnceng/internal/_build/results?buildId=1664939&view=results
- https://dev.azure.com/dnceng/internal/_build/results?buildId=1665084&view=results
- https://dev.azure.com/dnceng/internal/_build/results?buildId=1665221&view=results
- https://dev.azure.com/dnceng/internal/_build/results?buildId=1665434&view=results
Per information from @agocke, there is a problem with the CDN behind MCR.
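Since the Linux_musl_x64 failure earlier in this thread times out dialing the regional data endpoint while the other failures time out on mcr.microsoft.com itself, a quick probe from an affected agent could help confirm whether the registry front end or the CDN is the part that is unreachable. This is a hypothetical diagnostic only; the endpoints are copied from the errors above and the timeout values are arbitrary.

```bash
# Hypothetical diagnostic (not an existing pipeline step): probe the MCR front end and
# the regional CDN endpoint seen in the failures above, with short timeouts, to tell a
# client-side hang apart from a registry/CDN-side outage.
for url in "https://mcr.microsoft.com/v2/" "https://westus2.data.mcr.microsoft.com/"; do
  echo "--- $url"
  curl --silent --show-error --output /dev/null \
       --connect-timeout 10 --max-time 30 \
       --write-out 'HTTP %{http_code} in %{time_total}s\n' \
       "$url" || echo "unreachable within 30s"
done
```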
After doing some additional research, we could not find any recent instances of this error. Should it occur again, we will open another issue with the MCR team.
@ilyas1974 https://github.com/dotnet/roslyn/pull/56162/checks?check_run_id=5776024591
Reopening, as new instances of this issue are happening, such as:
- https://dev.azure.com/dnceng/public/_build/results?buildId=1752769&view=logs&j=174348cd-3455-59d8-c7f7-32c969b0807d&t=eda8acff-ca02-497c-b15e-87884972117b&l=29
Do y'all think this should be a known build error, or marked as critical? My impression is that the hit count is low enough that it doesn't meet the critical bar... but I'm not sure we have a bar yet. :) Thoughts, @ilyas1974?
We set a bar of 200 jobs impacted before we engage a partner team (metric review back in May).
As there have not been any instances of this issue for the last 7 days, I am closing this issue.