mssql-docker
mssql-docker copied to clipboard
Healthcheck
Would it be possible to implement Docker's HEALTHCHECK command?
Yeah, now that we have sqlcmd in the image this should be much easier. I'll put it on our backlog internally. If you need something sooner, I'd suggest creating your own image FROM ours and specifying a HEALTCHECK that uses sqlcmd with -q to execute a simple query periodically like 'SELECT 1'.
I too am having an issue where the docker image crashes, but continues to output lines of logging, which prevents my internal hung-build-killer tool to detect that something went wrong.
2017-09-06 09:50:11.21 spid90 Timeout occurred while waiting for latch: class 'LOG_MANAGER', id 00000005ED740BB0, type 4, Task 0x00000005EB49B088 : 0, waittime 71400 seconds, flags 0x1a, owning task 0x00000005EB492CA8. Continuing to wait. 2017-09-06 09:50:25.26 spid89 Timeout occurred while waiting for latch: class 'LOG_MANAGER', id 00000005ED740BB0, type 4, Task 0x00000005EEDE7088 : 0, waittime 71400 seconds, flags 0x1a, owning task 0x00000005EB492CA8. Continuing to wait. 2017-09-06 09:51:25.25 spid84 Timeout occurred while waiting for latch: class 'LOG_MANAGER', id 00000005ED740BB0, type 4, Task 0x00000005E800DC28 : 0, waittime 71400 seconds, flags 0x1a, owning task 0x00000005EB492CA8. Continuing to wait.
having a healthcheck mechanism in the case of this would be great.
I get the same crash / error after using netcat to ascertain when the process has started listening on port 1433
This is super useful in order to avoid having to develop custom poll and wait script to check whether the DB is up in order to be able to start other activities.
Dockerfile:
FROM microsoft/mssql-server-linux
HEALTHCHECK --interval=10s --timeout=3s --start-period=10s --retries=10 \
CMD sqlcmd -S localhost -U sa -P ${SA_PASSWORD} -Q "SELECT 1" || exit 1
It should be
CMD /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SA_PASSWORD} -Q "SELECT 1" || exit 1
Because sqlcmd is not on PATH
Is there a simple way to use HEALTHCHECK?
I use this piece of code, but maybe you can advise something simpler?
https://github.com/ObjectivityLtd/DBTestCompare/blob/master/restoreBackup.ps1#L7
do { $JSON=docker inspect --format='{{json .State.Health}}' sqlserver-container | Out-String | ConvertFrom-Json Write-Host HEALTHCHECK: $JSON.Status.ToString() sqlserver-container Start-Sleep -s 10 } Until ($JSON.Status.ToString() -eq 'healthy')
version: "3.4"
services:
db:
image: "mcr.microsoft.com/mssql/server:2019-latest"
environment:
SA_PASSWORD: "Secure-Password.1234"
ACCEPT_EULA: "Y"
healthcheck:
test: /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$$SA_PASSWORD" -Q "SELECT 1" || exit 1
interval: 10s
timeout: 3s
retries: 10
start_period: 10s
Everything that is written above must be changed: required with the option: -b and I recommend using the option: -o /dev/null in docker-compose.yaml
healthcheck:
test: /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$${SA_PASSWORD}" -Q "SELECT 1" -b -o /dev/null
interval: 10s
timeout: 3s
retries: 10
start_period: 10s
in Dockerfile:
HEALTHCHECK --interval=10s --timeout=3s --start-period=10s --retries=10 \
CMD /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $${SA_PASSWORD} -Q "SELECT 1" -b -o /dev/null
I'm curious here, why do we need to have exit 1
if sqlcmd -Q
already has a built-in exit?
-Q "cmdline query" (and exit)
https://docs.microsoft.com/en-us/sql/tools/sqlcmd-utility?view=sql-server-ver15#syntax
Yes, exit 1
is not needed. Updated
These solution don't take into account that during database migration after an update, the database can be locked for several minutes. To the developers: Is it possible to take that into account? Does sql-server have an idiomatic way to determine that it is currently upgrading databases? Are you maybe working on a HEALTHCHECK command that is working with that scenario?
The best idea so far to work around this is to add --start-period=10m
to the healthcheck.
Thanks @mrlioncub 🎉 I was searching for a solution for using the image in a GitHub Action and the healthcheck wasn't working...thankfully Google found your comment.
Passing -b -o /dev/null
solved the issue for me. Sharing for future reference:
name: CI
on: push
jobs:
test:
name: Test
runs-on: ubuntu-latest
env:
DB_USER: SA
DB_PASSWORD: yourStrongP@ssword
DB_NAME: your_db
services:
mssql:
image: mcr.microsoft.com/mssql/server:2017-latest-ubuntu
env:
ACCEPT_EULA: Y
SA_PASSWORD: ${{ env.DB_PASSWORD }}
MSSQL_PID: Express
DB_USER: ${{ env.DB_USER }}
DB_NAME: ${{ env.DB_NAME }}
ports:
- 1433/tcp
options: >-
--health-cmd "/opt/mssql-tools/bin/sqlcmd -U $DB_USER -P $SA_PASSWORD -Q 'select 1' -b -o /dev/null"
--health-interval 60s
--health-timeout 30s
--health-start-period 20s
--health-retries 3
steps:
- name: Create Database
run: |
docker exec $(docker ps -alq) /opt/mssql-tools/bin/sqlcmd -U "$DB_USER" -P "$DB_PASSWORD" -Q "CREATE DATABASE $DB_NAME;"
- uses: actions/checkout@v2
- name: Setup Node Package Cache
uses: actions/cache@v2
with:
path: ~/.npm
key: ${{ runner.OS }}-node-${{ hashFiles('**/package-lock.json') }}
- name: Install
run: |
npm ci --prefer-offline --no-audit
- name: Test
env:
DB_HOST: localhost
DB_PORT: ${{ job.services.mssql.ports[1433] }}
DB_USER: ${{ env.DB_USER }}
DB_PASSWORD: ${{ env.DB_PASSWORD }}
DB_NAME: ${{ env.DB_NAME }}
run: |
npm test
yes ,I have the same problem,use these can't resolve. how to config sqlserver healthycheck?
BEST PRACTICE
services:
instance:
container_name: sql-server-2022
image: mcr.microsoft.com/mssql/server:2022-latest
restart: unless-stopped
environment:
ACCEPT_EULA: Y
MSSQL_SA_PASSWORD: SA-PASSWORD
MSSQL_PID: Enterprise
ports:
- 2222:1433
healthcheck:
test: /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "SA-PASSWORD" -Q "SELECT 1" -b -o /dev/null
interval: 10s
timeout: 3s
retries: 10
start_period: 10s
volumes:
- mssql:/var/opt/mssql
networks:
- internal
I disagree with the "best practice" of using -o /dev/null
in the parameters. If you do that then the healthcheck output as seen in docker container inspect <container_id>
is an empty string - which can hide clues about which part of the network/SQL stack is responsible for the failure.
If you do not include -o /dev/null
in the parameters then you can see the actual error message as demonstrated in this snippet:
{
"Health": {
"Status": "unhealthy",
"FailingStreak": 1,
"Log": [
{
"Start": "2023-05-23T05:41:54.699717693Z",
"End": "2023-05-23T05:41:54.827199774Z",
"ExitCode": 1,
"Output": "Msg 102, Level 15, State 1, Server 0999316e8300, Line 1\nIncorrect syntax near 'Error'.\n"
}
]
}
}
Hi,
I'm curious why it's best practice to execute a command with a clear text password that can be read from anywhere on the system e.g. with ps -a
. Is there really no other option to ask the mssql process if it's running and at good health? Something like pgrep -f /opt/mssql/bin/sqlservr && exit 1 || exit 0
would at least be more secure, but presumably not as reliable.
Hi, I'm curious why it's best practice to execute a command with a clear text password that can be read from anywhere on the system e.g. with
ps -a
. Is there really no other option to ask the mssql process if it's running and at good health? Something likepgrep -f /opt/mssql/bin/sqlservr && exit 1 || exit 0
would at least be more secure, but presumably not as reliable.
You have two options here:
a) use SQLCMDPASSWORD environment variable to pass the password b) use the new go-sqlcmd tool, which allows you to define a config file, from where it can read the password.
since /opt/mssql-tools/bin/sqlcmd
is NOT available on linux/arm64
(Mac M1, M2, etc) for mcr.microsoft.com/azure-sql-edge:latest
Can someone please provide a command that will work with whatever flavour of linux mcr.microsoft.com/azure-sql-edge:latest
is using? This way no additional tools need to be installed, if possible.
I realize it's a different image, but the ask is fairly generic: how does one test for port being open, and accepting connections w/o using specific tooling such as sqlcmd
@clearwaterstream Taking inspiration from StackOverflow [1], I was able to use this since timeout
is available in the image:
healthcheck:
test: timeout 1 bash -c 'cat < /dev/null > /dev/tcp/127.0.0.1/1433'
- https://stackoverflow.com/a/19866239/84473