mssql-docker
Unable to start container on Linux 6.7
Currently unable to start the container on Arch Linux as the host OS. The dump files for the failing sqlservr process don't really provide any insight as to why:
docker compose logs db
db-1 | SQL Server 2022 will run as non-root by default.
db-1 | This container is running as user mssql.
db-1 | Your master database file is owned by mssql.
db-1 | To learn more visit https://go.microsoft.com/fwlink/?linkid=2099216.
db-1 | This program has encountered a fatal error and cannot continue running at Mon Jan 15 18:19:00 2024
db-1 | The following diagnostic information is available:
db-1 |
db-1 | Reason: 0x00000001
db-1 | Signal: SIGABRT - Aborted (6)
db-1 | Stack:
db-1 | IP Function
db-1 | ---------------- --------------------------------------
db-1 | 000064eb280a3ce1 std::__1::bad_function_call::~bad_function_call()+0x96661
db-1 | 000064eb280a36a6 std::__1::bad_function_call::~bad_function_call()+0x96026
db-1 | 000064eb280a2c2f std::__1::bad_function_call::~bad_function_call()+0x955af
db-1 | 00007c18f8810520 __sigaction+0x50
db-1 | 00007c18f88649fc pthread_kill+0x12c
db-1 | 00007c18f8810476 raise+0x16
db-1 | 00007c18f87f67f3 abort+0xd3
db-1 | 000064eb28074d96 std::__1::bad_function_call::~bad_function_call()+0x67716
db-1 | 000064eb280b15b4 std::__1::bad_function_call::~bad_function_call()+0xa3f34
db-1 | 000064eb280df318 std::__1::bad_function_call::~bad_function_call()+0xd1c98
db-1 | 000064eb280df0fa std::__1::bad_function_call::~bad_function_call()+0xd1a7a
db-1 | 000064eb2807b20a std::__1::bad_function_call::~bad_function_call()+0x6db8a
db-1 | 000064eb2807ae80 std::__1::bad_function_call::~bad_function_call()+0x6d800
db-1 | Process: 10 - sqlservr
db-1 | Thread: 157 (application thread 0x264)
db-1 | Instance Id: 83ef72ce-1100-44c4-913c-45d0df61ae44
db-1 | Crash Id: 05e56c63-9bd1-47db-b3d5-c1f58cebd578
db-1 | Build stamp: a9299dd605c652a3cea4246273441bcfaf48afb4b482ab9dc43771eecaf6600b
db-1 | Distribution: Ubuntu 22.04.3 LTS
db-1 | Processors: 32
db-1 | Total Memory: 67119079424 bytes
db-1 | Timestamp: Mon Jan 15 18:19:00 2024
db-1 | Last errno: 2
db-1 | Last errno text: No such file or directory
db-1 | Capturing a dump of 10
db-1 | Successfully captured dump: /var/opt/mssql/log/core.sqlservr.1_15_2024_18_19_0.10
db-1 | Executing: /opt/mssql/bin/handle-crash.sh with parameters
db-1 | handle-crash.sh
db-1 | /opt/mssql/bin/sqlservr
db-1 | 10
db-1 | /opt/mssql/bin
db-1 | /var/opt/mssql/log/
db-1 |
db-1 | 83ef72ce-1100-44c4-913c-45d0df61ae44
db-1 | 05e56c63-9bd1-47db-b3d5-c1f58cebd578
db-1 |
db-1 | /var/opt/mssql/log/core.sqlservr.1_15_2024_18_19_0.10
db-1 |
db-1 | Ubuntu 22.04.3 LTS
db-1 | Capturing core dump and information to /var/opt/mssql/log...
Docker-compose file:
version: '3'
services:
  db:
    image: 'mcr.microsoft.com/mssql/server:2022-latest'
    environment:
      - ACCEPT_EULA=Y
      - MSSQL_SA_PASSWORD=<there would be a password here>
      - MSSQL_PID=Developer
    volumes:
      - ./logs:/var/opt/mssql/log
      - ./data:/var/opt/mssql/data
    ports:
      - 1433:1433
Docker logs and data directory are set as UID:GID 10001:10001.
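For anyone who wants to rule out bind-mount permissions before blaming the kernel, here is a minimal Python sketch of that ownership check. The helper name owned_by is hypothetical; the 10001:10001 values and the ./logs and ./data paths are the ones from the compose setup above.

```python
import os

# UID:GID the thread reports for the mounted directories.
MSSQL_UID = 10001
MSSQL_GID = 10001


def owned_by(path, uid=MSSQL_UID, gid=MSSQL_GID):
    """Return True if `path` is owned by the given uid and gid."""
    st = os.stat(path)
    return st.st_uid == uid and st.st_gid == gid


if __name__ == "__main__":
    # Paths taken from the volumes section of the compose file.
    for directory in ("./logs", "./data"):
        if os.path.isdir(directory):
            print(directory, "ok" if owned_by(directory) else "wrong owner")
        else:
            print(directory, "missing")
```

If both directories report "ok" and the container still aborts, ownership is probably not the cause.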
I have the same issue. Found that it's the 6.7 kernel update. (https://github.com/microsoft/mssql-docker/issues/858#issuecomment-1892216070)
Rolling back to 6.6.10 makes it work again.
I experienced the same behavior today. First my existing container grew in size very quickly. I tried creating other containers but they all failed with the above message.
It took me a while to figure out that downgrading my kernel fixes the issue, but downgrading to 6.6.11 did the trick.
I can also confirm that I see the same behaviour. It works with kernel 6.6, and with 6.7 I get a similar message to the one above.
I downgraded my kernel and the container now functions.
Is this limited to just this container or docker needing to update something to be compatible with the 6.7 kernel?
I have the same problem running the container in Podman, but the Docker container runs without any problem. I simply pulled the image with sudo podman pull mcr.microsoft.com/mssql/server:2022-latest and ran it:
sudo podman run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Str0ngPass!" -p 1433:1433 --name sql-test --hostname sql-test -d mcr.microsoft.com/mssql/server:2022-latest
Attached is a log file. sql-test.log
Can confirm on Arch Linux, both the docker images for versions 2017, 2019 and 2022 and the AUR version give the same result.
Last errno text: No such file or directory
After downgrading the kernel to version 6.6.10-arch1-1 it starts successfully.
I can confirm this on Nobara 39 with 6.7.0 kernel. Exactly same issue for 2017, 2019, 2022 mssql. 6.6.9 works fine.
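To check quickly whether a host falls into the range reported here, a small Python sketch follows. The function names are hypothetical, and the 6.7 threshold is only what this thread reports (6.6.x works, 6.7.x fails); it assumes nothing about kernels nobody here has tested.

```python
import platform
import re


def kernel_version(release):
    """Parse a release string such as '6.6.10-arch1-1' into (major, minor, patch)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(m[1]), int(m[2]), int(m[3] or 0))


def reported_as_affected(release):
    """True for 6.7 and later, the range this thread reports as failing.

    Assumption: later kernels behave like 6.7.x; the thread does not confirm that.
    """
    return kernel_version(release)[:2] >= (6, 7)


if __name__ == "__main__":
    release = platform.release()
    print(release, "affected" if reported_as_affected(release) else "not in reported range")
```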
It seems like this was solved in the AUR package mssql-server: https://aur.archlinux.org/packages/mssql-server#comment-953063.
However I'm still having trouble building the needed dependency to verify...
For what it is worth:
I am running Gentoo with a custom 6.7.x kernel. It looks like it fails trying to access the cgroup v1 file "/sys/fs/cgroup/memory/memory.limit_in_bytes". I suspect that switching cgroups to "hybrid" mode would fix the issue, but I am not up to rebooting my machine right now.
$ docker run -it --rm -e ACCEPT_EULA=Y -e MSSQL_PID=Developer mcr.microsoft.com/mssql/server:2022-latest -- /bin/bash
sleep 1000
in another terminal, run
ps fax|less
# find pid of bash which is parent of sleep
sudo strace -o mssql.strace -f -s1000 -p <bash-in-mssql-docker>
Return to the first terminal, press Ctrl-C to interrupt the sleep, then run /opt/mssql/bin/sqlservr and wait for it to crash. Go to the second terminal and interrupt strace.
$ grep -P '"/(proc|sys).*ENOENT' mssql.strace
9999 openat(AT_FDCWD, "/sys/fs/cgroup/memory/memory.limit_in_bytes", O_RDONLY) = -1 ENOENT (No such file or directory)
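The same filter can be done in Python if you want the unique paths rather than raw lines. This is a rough sketch, assuming the trace was written to mssql.strace as in the command above; missing_kernel_paths is a hypothetical helper name.

```python
import re

# Quoted path under /proc or /sys, followed later on the line by ENOENT.
PATTERN = re.compile(r'"(/(?:proc|sys)[^"]*)".*ENOENT')


def missing_kernel_paths(lines):
    """Return the distinct /proc and /sys paths whose syscalls failed with ENOENT."""
    paths = []
    for line in lines:
        m = PATTERN.search(line)
        if m and m.group(1) not in paths:
            paths.append(m.group(1))
    return paths


if __name__ == "__main__":
    try:
        with open("mssql.strace", errors="replace") as trace:
            for path in missing_kernel_paths(trace):
                print(path)
    except FileNotFoundError:
        print("mssql.strace not found; run the strace step first")
```

Keep in mind that a missing file in the trace is only a lead, since programs routinely probe for files that do not exist.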
I think the ENOENT is not the issue, especially not /sys/fs/cgroup/memory/memory.limit_in_bytes, since this doesn't exist on kernel 6.6.13 either, and mssql runs fine there.
My crashlogs on 6.7.1 showed Invalid argument / 22 / EINVAL:
This program has encountered a fatal error and cannot continue running at Mon Jan 22 18:09:17 2024
The following diagnostic information is available:
Reason: 0x00000001
Signal: SIGABRT - Aborted (6)
Stack:
IP Function
---------------- --------------------------------------
0000613cdff2ace1 std::__1::bad_function_call::~bad_function_call()+0x96661
0000613cdff2a6a6 std::__1::bad_function_call::~bad_function_call()+0x96026
0000613cdff29c2f std::__1::bad_function_call::~bad_function_call()+0x955af
0000753f7ee4d520 __sigaction+0x50
0000753f7eea19fc pthread_kill+0x12c
0000753f7ee4d476 raise+0x16
0000753f7ee337f3 abort+0xd3
0000613cdfefbd96 std::__1::bad_function_call::~bad_function_call()+0x67716
Process: 10 - sqlservr
Thread: 161 (application thread 0x278)
Instance Id: ba778b4b-ea20-4f3c-98fa-2002d4c8e68c
Crash Id: 3674de73-5de7-494e-8530-2520421dd97f
Build stamp: a9299dd605c652a3cea4246273441bcfaf48afb4b482ab9dc43771eecaf6600b
Distribution: Ubuntu 22.04.3 LTS
Processors: 16
Total Memory: 29180137472 bytes
Timestamp: Mon Jan 22 18:09:17 2024
Last errno: 22
Last errno text: Invalid argument
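Since the crash reports in this thread differ in their "Last errno" value (2 in some, 22 here), a short Python sketch for pulling that field out of a saved report and naming it may help when comparing logs. The helper name last_errno is hypothetical; the symbolic names come from Python's standard errno table.

```python
import errno
import re

# Matches the numeric field only, not the "Last errno text:" line.
ERRNO_LINE = re.compile(r"Last errno:\s*(\d+)")


def last_errno(report):
    """Return (code, symbolic name) for the 'Last errno' field, or None if absent."""
    m = ERRNO_LINE.search(report)
    if m is None:
        return None
    code = int(m.group(1))
    return code, errno.errorcode.get(code, "UNKNOWN")


if __name__ == "__main__":
    print(last_errno("Last errno: 22\nLast errno text: Invalid argument"))
```

It also works on docker compose output, because the "db-1 | " prefix does not interfere with the search.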
The problem is still there with kernel 6.7.2
same problem on 6.7.1-arch1-1
As a bad side effect the lsof process it spawns starts eating a core
Hello,
The issue has been identified and should be fixed in the next SQL Server 2022 CU, but we cannot commit to a specific CU release or timeline as sometimes plans can unexpectedly change. No further investigation or data points should be needed for this, but thank you all for reporting and for looking for potential causes.
It is unrelated to cgroups, and at first glance it might be a kernel bug (but do not quote me on this) - it appears that as of 6.7, mmap without MAP_FIXED may sometimes ignore the address hint even if the hinted region is in fact available. I have not investigated the kernel side of things further, but I think it might be related to this series of changes and/or its preceding/following changes.
Knowing this, I cannot think of any workaround other than sticking to 6.6 in the meantime.
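To illustrate what an mmap address hint is, here is a minimal Python sketch using ctypes. It is an illustration only, not how SQL Server maps memory and not a reproducer for this bug. It assumes 64-bit Linux (off_t passed as a C long, libc reachable through ctypes.CDLL(None)); map_anon and hint_was_honored are hypothetical helper names.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE
MAP_FAILED = ctypes.c_void_p(-1).value  # (void *) -1

# Assumption: Linux, where CDLL(None) exposes the libc symbols of the process.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]


def map_anon(hint, length):
    """Anonymous private mapping; `hint` is advisory because MAP_FIXED is not set."""
    addr = libc.mmap(hint, length,
                     mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    return addr


def hint_was_honored(length=16 * PAGE):
    """Map a region, free it, then ask for the same address again as a hint."""
    probe = map_anon(None, length)   # let the kernel choose an address
    libc.munmap(probe, length)       # that range is now free again
    got = map_anon(probe, length)    # request the same range via the hint
    libc.munmap(got, length)
    return got == probe


if __name__ == "__main__":
    print("hint honored:", hint_was_honored())
```

Per mmap(2) the hint is only advisory, so a False result here proves nothing on its own; the point is that code relying on the hint being honored can break when kernel placement behavior changes.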
Thank you very much for the patch. Are there plans to also backport it to 2019?
Just wanted to say I am so glad you have all written here. I didn't even think about the fact that I had just upgraded my Arch system, and I was about to start tearing things apart, so this has saved me a heck of a lot of time. While I am here to say thank you, I can also confirm this is still happening on Arch Linux on 6.7.4.
Hi! We are running an MSSQL-based project on a Mac and use the image mcr.microsoft.com/mssql/server:2019-latest through Podman. Podman will not start a container with this image since the kernel was updated. How can we revert the kernel version of the host, or is there another workaround? Any help would be highly appreciated. Thanks!
Same issue with Fedora 39 on 6.7.2 and 6.7.3, but fine on 6.6.x and 6.5.x (in case anyone is searching for this issue and using Fedora). Looking forward to the CU @fbrosseau
I think MSFT should strongly consider backporting this at least to SQL Server 2019 if not even 2017 as well. As people continue to upgrade their kernels this is going to be happening on an ever larger scale to existing SQL Server linux / container installations.
Am I missing something? I do not see any updated Docker images for mcr.microsoft.com/mssql/server:2022-latest that would make it run on 6.7.*.
It should be included in the next CU, no date estimate
I've been keeping an eye on this page for the next CU, presumably CU12, to be released.
Not working on 6.7.5 either.
I am glad I ran into this page. This started happening recently on Fedora 39. Kernel 6.7.4. I will test another kernel and report back.
Edit: Works on 6.6.13.
mysql, pgsql and sqlite all work no problem. but m$ seems to be able to afford not to give a crap about a regression in the latest kernel. not amused.
docker logs 9536fdc556e1
This program has encountered a fatal error and cannot continue running at Tue Feb 27 19:29:45 2024
The following diagnostic information is available:
Reason: 0x00000001
Signal: SIGABRT - Aborted (6)
Stack:
IP Function
---------------- --------------------------------------
000056dc072752fc <unknown>
000056dc07274d42 <unknown>
000056dc07274351 <unknown>
00007c8fbb447090 killpg+0x40
00007c8fbb44700b gsignal+0xcb
00007c8fbb426859 abort+0x12b
000056dc071fb3d2 <unknown>
000056dc07287304 <unknown>
000056dc072bc388 <unknown>
000056dc072bc16a <unknown>
000056dc0720724a <unknown>
000056dc07206e9f <unknown>
Process: 12 - sqlservr
Thread: 83 (application thread 0x134)
Instance Id: 252d75bf-d3a4-4b38-a78f-b83488b53759
Crash Id: 855b8579-9053-4856-ad38-69e4a54d6ff6
Build stamp: e149a9e980d9936d4f4a616b06112de0e7b2f4e45c2cd3a0884ae319ad3d13b7
Distribution: Ubuntu 20.04.6 LTS
Processors: 12
Total Memory: 16618233856 bytes
Timestamp: Tue Feb 27 19:29:45 2024
Last errno: 2
Last errno text: No such file or directory
Capturing a dump of 12
Successfully captured dump: /var/opt/mssql/log/core.sqlservr.2_27_2024_19_29_45.12
Executing: /opt/mssql/bin/handle-crash.sh with parameters
handle-crash.sh
/opt/mssql/bin/sqlservr
12
/opt/mssql/bin
/var/opt/mssql/log/
252d75bf-d3a4-4b38-a78f-b83488b53759
855b8579-9053-4856-ad38-69e4a54d6ff6
/var/opt/mssql/log/core.sqlservr.2_27_2024_19_29_45.12
Ubuntu 20.04.6 LTS
Capturing core dump and information to /var/opt/mssql/log...
/bin/cat: /proc/12/maps: Permission denied
SQL server is unavailable - sleeping
Any plans to upgrade the Docker image to resolve this issue?
Well, first there needs to be a new CU release. The last one is from January 2024 and the pace seems to be about one release per month, so a new release should be due soon. But the team is not communicating release dates, so all we can do at this point is wait.
Keep track of this page to see whether a new CU is released.
It's a bit infuriating that we need to wait for a critical bug fix to land on a monthly cumulative update without being even certain whether it actually will.
It would be much more productive instead to post here instructions on how to migrate the database to postgres and be done with it lol
What the hell same issue here
When can we hope for CU12 that will include the fix?
It's been 2 weeks already.