[Bug]: MsSql health check does not complete on newest container image
Testcontainers version
3.9.0
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host arch
x64
.NET version
8.0.303
Docker version
Client:
Version: 25.0.5
API version: 1.44
Go version: go1.21.10
Git commit: d260a54c81efcc3f00fe67dee78c94b16c2f8692
Built: Sun May 12 07:25:43 2024
OS/Arch: linux/amd64
Context: default
Server:
Engine:
Version: 25.0.5
API version: 1.44 (minimum version 1.24)
Go version: go1.21.10
Git commit: e63daec8672d77ac0b2b5c262ef525c7cf17fd20
Built: Sun May 12 07:25:43 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.10
GitCommit: 4e1fe7492b9df85914c389d1f15a3ceedbb280ac
runc:
Version: 1.1.12
GitCommit: 51d5e94601ceffbbd85688df1c928ecccbfa4685
docker-init:
Version: 0.19.0
GitCommit:
Docker info
Client:
Version: 25.0.5
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.12.0
Path: /usr/libexec/docker/cli-plugins/docker-buildx
Server:
Containers: 29
Running: 5
Paused: 0
Stopped: 24
Images: 15
Server Version: 25.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4e1fe7492b9df85914c389d1f15a3ceedbb280ac
runc version: 51d5e94601ceffbbd85688df1c928ecccbfa4685
init version:
Security Options:
seccomp
Profile: builtin
Kernel Version: 5.15.153.1-microsoft-standard-WSL2
Operating System: Rancher Desktop WSL Distribution
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 15.58GiB
Name: CCD-0024
ID: 398be532-db59-47b3-bcf7-d989f4f09517
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
What happened?
When using the MsSql package with the newest container image mcr.microsoft.com/mssql/server:2022-latest with a digest of sha256:c1aa8afe9b06eab64c9774a4802dcd032205d1be785b1fd51e1c0151e7586b74, the health check specified in the waiting strategy never completes, even though the logs of the SQL server container show it being ready, leading to a timeout.
This behavior is not present when using a slightly older container image version, e.g. mcr.microsoft.com/mssql/server:2022-CU13-ubuntu-22.04 with a digest of sha256:c4369c38385eba011c10906dc8892425831275bb035d5ce69656da8e29de50d8.
Relevant log output
[testcontainers.org 00:00:00.38] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:00.38] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:00.38] Searching Docker registry credential in Auths
[testcontainers.org 00:00:00.38] Docker registry credential https://index.docker.io/v1/ found
[testcontainers.org 00:00:01.50] Docker image testcontainers/ryuk:0.6.0 created
[testcontainers.org 00:00:01.58] Docker container 8d1b2fa17535 created
[testcontainers.org 00:00:01.64] Start Docker container 8d1b2fa17535
[testcontainers.org 00:00:01.96] Wait for Docker container 8d1b2fa17535 to complete readiness checks
[testcontainers.org 00:00:01.96] Docker container 8d1b2fa17535 ready
[testcontainers.org 00:00:01.97] Searching Docker registry credential in Auths
[testcontainers.org 00:00:01.97] Searching Docker registry credential in Auths
[testcontainers.org 00:00:01.97] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:01.97] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:01.97] Docker registry credential mcr.microsoft.com not found
[testcontainers.org 00:00:18.94] Docker image mcr.microsoft.com/mssql/server:2022-latest created
[testcontainers.org 00:00:18.96] Docker container 4a3b482d21c9 created
[testcontainers.org 00:00:18.97] Start Docker container 4a3b482d21c9
[testcontainers.org 00:00:19.20] Wait for Docker container 4a3b482d21c9 to complete readiness checks
[testcontainers.org 00:00:19.20] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
[testcontainers.org 00:00:20.27] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
[testcontainers.org 00:00:21.32] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
[testcontainers.org 00:00:22.42] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
[testcontainers.org 00:00:23.58] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
[testcontainers.org 00:00:24.69] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
...
[testcontainers.org 00:03:37.79] Execute "/opt/mssql-tools/bin/sqlcmd -Q SELECT 1;" at Docker container 4a3b482d21c9
Additional information
No response
We see the same problem at the moment.
This is also affecting us when it's running inside our GitHub Actions for CI/CD. It's currently preventing us from doing any releases.
I confirm that our tests using TestContainers and MsSQL stopped passing today 🤕
Looking at the image, it seems that the path for sqlcmd has changed from /opt/mssql-tools/bin/sqlcmd to /opt/mssql-tools18/bin/sqlcmd. Not sure whether this was intentional.
FYI someone has reported it on MSSQL-Docker: https://github.com/microsoft/mssql-docker/issues/892
As mentioned in Slack, we likely need to adapt the default wait strategy (see https://github.com/testcontainers/testcontainers-dotnet/blob/develop/src/Testcontainers.MsSql/MsSqlBuilder.cs#L132-L145).
Users can provide their own wait strategy configuration as a workaround.
This started blocking our Azure DevOps pipeline yesterday.
After combining @pascalberger's comment with @kiview's, I first ran into certificate issues:
Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : SSL Provider: [error:0A000086:SSL routines::certificate verify failed:self-signed certificate]. Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : Client unable to establish connection. For solutions related to encryption errors, see https://go.microsoft.com/fwlink/?linkid=2226722.
But I got it working for now by also adding the -C option:
.WithWaitStrategy(
    Wait.ForUnixContainer()
        .UntilCommandIsCompleted("/opt/mssql-tools18/bin/sqlcmd", "-C", "-Q", "SELECT 1;")
)
This works when run locally, but still times out when run in an Azure DevOps pipeline:
new MsSqlBuilder()
    .WithImage("mcr.microsoft.com/mssql/server:2022-latest")
    .WithEnvironment("ACCEPT_EULA", "Y")
    .WithPortBinding(11143, 1433)
    .WithWaitStrategy(
        Wait.ForUnixContainer()
            .UntilCommandIsCompleted(
                "/opt/mssql-tools/bin/sqlcmd",
                "-C",
                "-Q",
                "SELECT 1;"
            )
    )
    .Build();
This times out in both:
new MsSqlBuilder()
    .WithImage("mcr.microsoft.com/mssql/server:2022-latest")
    .WithEnvironment("ACCEPT_EULA", "Y")
    .WithPortBinding(11143, 1433)
    .WithWaitStrategy(
        Wait.ForUnixContainer()
            .UntilCommandIsCompleted(
                "/opt/mssql-tools18/bin/sqlcmd",
                "-C",
                "-Q",
                "SELECT 1;"
            )
    )
    .Build();
I have been able to replicate this locally by deleting the cached 2022-latest container image. After it downloads the latest image, it hangs indefinitely.
Adding .WithWaitStrategy(Wait.ForUnixContainer().UntilCommandIsCompleted("/opt/mssql-tools18/bin/sqlcmd", "-C", "-Q", "SELECT 1;")) resolved the issue. Thanks @Fireblade954!
@tscrip that did it. We missed a test project so had a false negative. Thanks!
This is also affecting .NET Aspire - https://github.com/dotnet/aspire/issues/5057
After reading all these comments, I would like to point out that we recommend pinning the image version. Using the latest tag does not automatically update the cached image on your development machine; it will use the version it pulled weeks ago. Meanwhile, the ephemeral CI pipeline pulls the actual latest version because it is not cached (this may lead to different behaviors on developer machines and in the CI pipeline).
Since it looks like the new path will remain (https://github.com/microsoft/mssql-docker/issues/892#issuecomment-2249029917), we can update the default wait strategy for the new version. Overriding the wait strategy, as @Fireblade954 suggested, or pinning the version are workarounds to avoid this issue.
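As a rough illustration of the pinning workaround, the sketch below uses the CU13 tag that the issue description reports as working. It assumes the Testcontainers.MsSql package is referenced and that the default wait strategy still matches that older image; the variable name is only illustrative.

```csharp
using Testcontainers.MsSql;

// Pin a specific cumulative-update tag instead of the moving 2022-latest tag,
// so developer machines and CI agents resolve the same image.
// The tag is the one reported as working in the issue description.
var msSqlContainer = new MsSqlBuilder()
    .WithImage("mcr.microsoft.com/mssql/server:2022-CU13-ubuntu-22.04")
    .Build();

// Starting the container requires a reachable Docker daemon.
await msSqlContainer.StartAsync();
```

With a pinned tag, an image update only reaches the test suite when the tag is changed deliberately.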
We can probably do something similar to what we are doing in the MongoDB module to determine which binary (path) is available.
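Until the module handles this itself, one way to approximate the idea in user code is a custom wait strategy that probes both known sqlcmd locations. This is only a sketch: `WaitUntilSqlCmdSucceeds` is a hypothetical class, not part of Testcontainers, and it does not claim to mirror how the MongoDB module picks its binary. It assumes sqlcmd can authenticate without explicit -U/-P arguments, as the commands quoted in this thread do.

```csharp
using System.Threading.Tasks;
using DotNet.Testcontainers.Builders;
using DotNet.Testcontainers.Configurations;
using DotNet.Testcontainers.Containers;
using Testcontainers.MsSql;

var msSqlContainer = new MsSqlBuilder()
    .WithImage("mcr.microsoft.com/mssql/server:2022-latest")
    .WithWaitStrategy(Wait.ForUnixContainer()
        .AddCustomWaitStrategy(new WaitUntilSqlCmdSucceeds()))
    .Build();

// Starting the container requires a reachable Docker daemon.
await msSqlContainer.StartAsync();

// Hypothetical helper: reports readiness once "SELECT 1;" succeeds through
// whichever sqlcmd binary the image ships.
public sealed class WaitUntilSqlCmdSucceeds : IWaitUntil
{
    private static readonly string[] CandidatePaths =
    {
        "/opt/mssql-tools18/bin/sqlcmd", // path found in the newer image
        "/opt/mssql-tools/bin/sqlcmd",   // path used by older images
    };

    public async Task<bool> UntilAsync(IContainer container)
    {
        foreach (var path in CandidatePaths)
        {
            // -C trusts the server's self-signed certificate.
            var result = await container
                .ExecAsync(new[] { path, "-C", "-Q", "SELECT 1;" })
                .ConfigureAwait(false);

            // A missing binary or a server that is not ready yet yields a
            // non-zero exit code, so the next candidate is tried.
            if (result.ExitCode == 0)
            {
                return true;
            }
        }

        // Neither candidate succeeded; Testcontainers will retry later.
        return false;
    }
}
```

Probing on every retry costs one extra exec per attempt on images that only have the second path; caching the first path that succeeds would avoid that.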
BTW, this will also break the ExecScriptAsync method, as it also uses sqlcmd. Additionally, they now default to requiring encryption, which means you need to pass -C to sqlcmd to tell it to trust the server certificate.
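As a stopgap for running queries against the new image, sqlcmd can be called directly through `ExecAsync` with the new path and the -C flag. The extension method below is a hypothetical helper, not a Testcontainers API, and it assumes the same implicit authentication as the wait-strategy commands quoted earlier in this thread.

```csharp
using System.Threading.Tasks;
using DotNet.Testcontainers.Containers;

// Hypothetical helper: runs a query with the relocated sqlcmd binary.
public static class SqlCmdContainerExtensions
{
    // Path reported for the newer image in this issue.
    private const string SqlCmdPath = "/opt/mssql-tools18/bin/sqlcmd";

    public static Task<ExecResult> ExecQueryAsync(this IContainer container, string query)
    {
        // -C trusts the self-signed server certificate.
        // -b makes sqlcmd return a non-zero exit code when the batch fails.
        return container.ExecAsync(new[] { SqlCmdPath, "-C", "-b", "-Q", query });
    }
}
```

A caller would then inspect the returned `ExitCode` and `Stderr`, for example after `await container.ExecQueryAsync("SELECT 1;")`.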
This works for us:
.WithWaitStrategy(Wait.ForUnixContainer().UntilPortIsAvailable(1433))
This won't always work as the container might be ready but MSSQL might not be ready to receive requests.
That's true, although it fails very rarely, at least for us, and it usually works regardless of whether the old or new image from Microsoft is used.
I have now switched to the following and downloaded the new image locally:
.WithWaitStrategy(
    Wait.ForUnixContainer()
        .UntilCommandIsCompleted("/opt/mssql-tools18/bin/sqlcmd", "-C", "-Q", "SELECT 1;")
)