Latest images don't allow connections
Describe the bug
The latest .NET images don't seem to allow outbound connections, e.g. to the main, external NuGet feed.
Which .NET image(s) are you using?
mcr.microsoft.com/dotnet/sdk:8.0.407-windowsservercore-ltsc2019;mcr.microsoft.com/dotnet/sdk:9.0.201-windowsservercore-ltsc2019
Steps to reproduce
Sample image:
FROM mcr.microsoft.com/dotnet/sdk:9.0.201 AS build
WORKDIR /src
# Copy the csproj and restore as distinct layers
COPY *.csproj ./
RUN dotnet restore
COPY . ./
RUN dotnet tool install --global dotnet-sonarscanner
RUN dotnet build
Executing docker build --tag 'testdocker' . --no-cache gets these errors:
=> [4/7] RUN dotnet restore 15.7s
=> => # /src/dockerdotnet.csproj : error NU1301: Unable to load the service index for source https://api.nuget.org/v3/index.json.
=> => # /src/dockerdotnet.csproj : error NU1301: The SSL connection could not be established, see inner exception.
=> => # /src/dockerdotnet.csproj : error NU1301: The remote certificate is invalid because of errors in the certificate chain: PartialChain
=> => # /src/dockerdotnet.csproj : error NU1301: Unable to load the service index for source https://api.nuget.org/v3/index.json.
=> => # /src/dockerdotnet.csproj : error NU1301: The SSL connection could not be established, see inner exception.
=> => # /src/dockerdotnet.csproj : error NU1301: The remote certificate is invalid
Other information
Summarizing from this thread.
The behavior is consistent. Specifying a prior version is a workaround.
Output of docker version
Output of docker info
Are you using Windows or Linux images? You have both listed. The example seems to suggest Linux.
.NET SDK 9.0.202 was released yesterday, so give that a shot instead of 9.0.201.
That being said, I could not reproduce this using Linux or Windows, 9.0.201 or 9.0.202 (or 8.0.407).
I don't have the Sonarr project cloned, so I created a new console app using the SDK. I added a package to the project to ensure we reached out to NuGet.org.
ARG SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:9.0.202
# To test on Windows:
# ARG SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:9.0.202-windowsservercore-ltsc2019
# ARG SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:8.0.407-windowsservercore-ltsc2019
FROM $SDK_IMAGE AS build
# Create new console app
RUN dotnet new console -n Demo -o src/ --no-restore
WORKDIR src/
# Add an arbitrary package to ensure we reach out to NuGet.org
RUN dotnet add package Newtonsoft.Json --version 13.0.3
# Restore and build
RUN dotnet restore
RUN dotnet tool install --global dotnet-sonarscanner
RUN dotnet build
@nkolev92 any additional tips you recommend for troubleshooting SSL errors with NuGet restore?
@richlander My team uses the Windows images, specifically the windowsservercore-ltsc2019 tags. I'm not too savvy with Docker, so I'm not sure if the sample image maps to a Linux or Windows image; I'm guessing Linux based on @lbussell's example.
@lbussell I'm seeing the same error on 9.0.202. Should it be dotnet restore --no-cache to make sure it's reaching out?
I have good coworkers:
What is the root cause? Missing root CAs in the base image's (OS-level) certificate store for LocalMachine.
Seems the root certificates for DigiCert and AAA Certificate Services were removed with the 3/11 releases. Can anyone bring some insight into that change?
@Matthew-Ricks-USBE did you work around this?
Our workaround has been to specify mcr.microsoft.com/dotnet/sdk:8.0.406-windowsservercore-ltsc2019 or mcr.microsoft.com/dotnet/sdk:9.0.200-windowsservercore-ltsc2019 explicitly.
@richlander @lbussell @nkolev92 Have we been able to confirm this is the issue? I'm happy to keep digging if this isn't it, but I'd like some confirmation before going down the rabbit hole.
Hi @Matthew-Ricks-USBE, sorry for the long delay between responses. I reached out to the Windows team internally to see if they had any input and was waiting for a response.
The .NET SDK images don't change anything regarding certificates from the base Windows container images. I recommend you check for the certificates in the latest mcr.microsoft.com/windows/servercore:ltsc2019-amd64 image and compare it to a prior version. I think the previous version should be mcr.microsoft.com/windows/servercore@sha256:7a52edfa3a431d8758e70a42d96a6f3b5bd90865ffcc2e0b9222f3560062cb5a (based on this diff). Otherwise just check against the prior .NET image.
If the behavior is from the Windows base image, then you should file an issue on the microsoft/windows-containers repo.
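One way to make the comparison suggested above concrete is to dump the LocalMachine root store inside each image and diff the two listings. The following is only a minimal sketch, not an official tool: the `FormatCertLine` helper and the output layout are assumptions made for illustration, and it assumes a console project using top-level statements.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper: one stable, diff-friendly line per certificate.
static string FormatCertLine(string thumbprint, DateTime notAfter, string subject)
{
    string expires = notAfter.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    return $"{thumbprint}  {expires}  {subject}";
}

// Open the machine-wide trusted root store read-only.
using var store = new X509Store(StoreName.Root, StoreLocation.LocalMachine);
store.Open(OpenFlags.ReadOnly);

// Sort by thumbprint so listings from two images line up when diffed.
var sortedCerts = store.Certificates
    .Cast<X509Certificate2>()
    .OrderBy(c => c.Thumbprint, StringComparer.Ordinal);

foreach (X509Certificate2 cert in sortedCerts)
{
    Console.WriteLine(FormatCertLine(cert.Thumbprint, cert.NotAfter, cert.Subject));
}
```

Running this in both the current image and the prior one, then diffing the two outputs, should show whether any root certificates differ between them.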
/cc @dotnet/nuget-team - any additional troubleshooting advice? Thanks.
You can use the disableTLSCertificateValidation attribute in your source to disable TLS certificate validation. For more information, see https://learn.microsoft.com/en-us/nuget/reference/nuget-config-file#package-source-sections
Example:
<packageSources>
<add key="Invalid-certificate-https-source" value="https://httpsSourceTrusted/" disableTLSCertificateValidation="true" />
</packageSources>
You can use the disableTLSCertificateValidation attribute in your source to disable TLS certificate validation.
Note that you should never disable TLS certificate validation in a production environment. That would make you vulnerable to man-in-the-middle attacks, among other things.
@dotnet/ncl should have better advice for HttpClient troubleshooting than NuGet.
I've seen errors before along the lines of "trust could not be established", but this thread has the error message "The remote certificate is invalid", which makes me guess that the certificate itself is corrupt/incomplete.
Do other containers on the same host have any networking problems? Can you use something like PowerShell's Invoke-WebRequest to test networking to https endpoints before running dotnet restore? Alternatively, use a console app without any package references that does await (new HttpClient()).GetAsync("https://api.nuget.org/v3/index.json");
In any case, from the NuGet side we just rely on HttpClient "working", so the NCL team should have better advice on how to debug when it can't communicate with a remote server.
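The console-app check suggested above might look like the following. This is a rough sketch rather than a NuGet or NCL diagnostic: the `DescribeExceptionChain` helper and the 30-second timeout are assumptions added for illustration. It makes one request to the NuGet service index and, on failure, prints every inner exception so the underlying TLS error is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical helper: flatten an exception and its inner exceptions into lines.
static List<string> DescribeExceptionChain(Exception ex)
{
    var lines = new List<string>();
    for (Exception? current = ex; current != null; current = current.InnerException)
    {
        lines.Add($"{current.GetType().Name}: {current.Message}");
    }
    return lines;
}

// The timeout value is an arbitrary choice for this sketch.
using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(30) };
try
{
    using HttpResponseMessage response =
        await client.GetAsync("https://api.nuget.org/v3/index.json");
    Console.WriteLine($"HTTP {(int)response.StatusCode} {response.ReasonPhrase}");
}
catch (Exception ex) when (ex is HttpRequestException or OperationCanceledException)
{
    // TLS failures normally surface as inner exceptions of HttpRequestException.
    foreach (string line in DescribeExceptionChain(ex))
    {
        Console.WriteLine(line);
    }
}
```

If this succeeds in the same container where dotnet restore fails, that would point away from the image's certificate store and toward something specific to the restore step.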
I am not able to repro the issue using either
- mcr.microsoft.com/dotnet/sdk:8.0.407-windowsservercore-ltsc2019
- mcr.microsoft.com/dotnet/sdk:9.0.201-windowsservercore-ltsc2019
If you still face these errors, it would be useful to get the certificate information from inside the container. You can use the following code to dump the cert info from HttpClient.
You might have to compile it on your dev machine and pass the compiled binary to the docker image if compiling it inside of the image does not work.
using System;
using System.Net.Http;

SocketsHttpHandler handler = new SocketsHttpHandler
{
    SslOptions = new System.Net.Security.SslClientAuthenticationOptions
    {
        RemoteCertificateValidationCallback = (sender, certificate, chain, sslPolicyErrors) =>
        {
            // dump all certificate information for debugging purposes
            if (chain != null)
            {
                for (int i = 0; i < chain.ChainElements.Count; i++)
                {
                    var element = chain.ChainElements[i];
                    Console.WriteLine($"Certificate {i}:");
                    Console.WriteLine(element.Certificate.ToString(true));
                    foreach (var status in element.ChainElementStatus)
                    {
                        Console.WriteLine($"  Status: {status.Status}");
                        if (status.StatusInformation.Length > 0)
                        {
                            Console.WriteLine($"  Status Information: {status.StatusInformation}");
                        }
                    }
                    Console.WriteLine();
                }
            }
            Console.WriteLine($"SSL Policy Errors: {sslPolicyErrors}");
            // accept only if there are no errors
            return sslPolicyErrors == System.Net.Security.SslPolicyErrors.None;
        }
    },
};
using var client = new HttpClient(handler);
await client.GetAsync("https://api.nuget.org/v3/index.json");
And share the result here.
The nuget.org cert seems to be rooted in DigiCert Global Root G3, so if you can confirm that the root cert is missing in the container, e.g. via the following:
using System.Security.Cryptography.X509Certificates;

string certThumbPrint = "7E04DE896A3E666D00E687D33FFAD93BE83D349E"; // got this from the output of the code in previous comment
X509Store store = new X509Store(StoreName.Root, StoreLocation.LocalMachine);
store.Open(OpenFlags.ReadOnly);
X509Certificate2Collection certs = store.Certificates.Find(X509FindType.FindByThumbprint, certThumbPrint, false);
if (certs.Count == 0)
{
    Console.WriteLine($"Certificate with thumbprint {certThumbPrint} not found.");
}
else
{
    Console.WriteLine($"Certificate with thumbprint {certThumbPrint} found.");
    foreach (var cert in certs)
    {
        Console.WriteLine($"Subject: {cert.Subject}");
        Console.WriteLine($"Issuer: {cert.Issuer}");
        Console.WriteLine($"Valid From: {cert.NotBefore}");
        Console.WriteLine($"Valid To: {cert.NotAfter}");
    }
}
Then you should file an issue with the base image, as https://github.com/dotnet/dotnet-docker/issues/6325#issuecomment-2758457952 suggests.
Another part to consider is network location. ... especially if revocation check is enabled. But the PartialChain error would also show if the intermediates are missing and the system fails to download them.
Sorry for the slow follow up, but this isn't my area of expertise, so forgive me if I've been going about this wrong.
I haven't been able to find or figure out the tags for the windowsservercore image, so I haven't been able to check historical behavior. I tried invoking web requests to replicate the issue, but they've all been successful. If there was a certificate issue, I assume a subsequent release fixed it. I haven't been able to replicate the dotnet restore example myself, so I assume that's a red herring.
I'm still having issues using the new releases, but I think it's out of scope for this project/repository. I'll include some of the confounding variables here for others' reference and close the ticket.
3/11 is roughly the time when the images were updated, dotnet-sonarscanner was updated to v10, and a TeamCity update was applied to our instance. Our builds use TeamCity's Container Wrapper feature to run build steps in containers; each step gets a separate container.
After reviewing the build history, I've ruled out the TeamCity update being a factor, but I mention it here for completeness.
The dotnet-sonarscanner is a .NET tool which requires two steps, a begin and an end, that wrap the normal build and test steps. Again, there is no issue with distinct steps on old images, but on newer images the dotnet-sonarscanner end step throws an exception: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target. However, when our normally distinct steps are combined into one (e.g. a PowerShell script) and all run within a single container, things are successful.
This indicates to me that the newer images are fine and capable, and that some other factor is the cause. As far as I can tell, relevant side effects from the dotnet-sonarscanner step (and other steps) are being appropriately persisted and passed through a volume from one container to the next, but I have to assume at this point that I'm missing something, so I'll continue to plead my case with that community. There has been a similar issue reported, so hopefully someone there has insights.
FYI the issue is consistent across versions 9 and 10 of the dotnet-sonarscanner, so I presume it's a more fundamental assumption in the dotnet-sonarscanner that was violated in the newer images.