
Docker CosmosDB emulator crash - Kernel bug check - OS Fatal Error

Open timabell opened this issue 11 months ago • 2 comments

Describe the bug

The Cosmos DB emulator, running in a Docker container, crashed while a large integration test suite was running. I don't know which test caused it to fall over.

This is an evaluation version.  There are [159] days left in the evaluation period.
2.14.21.0 (1c783d8f)
Copyright (C) Microsoft Corporation. All rights reserved.
Starting
Started 1/21 partitions
Started 2/21 partitions
Started 3/21 partitions
Started 4/21 partitions
Started 5/21 partitions
Started 6/21 partitions
Started 7/21 partitions
Started 8/21 partitions
Started 9/21 partitions
Started 10/21 partitions
Started 11/21 partitions
Started 12/21 partitions
Started 13/21 partitions
Started 14/21 partitions
Started 15/21 partitions
Started 16/21 partitions
Started 17/21 partitions
Started 18/21 partitions
Started 19/21 partitions
Started 20/21 partitions
Started 21/21 partitions
Started
This program has encountered a fatal error and cannot continue running at Mon Dec  9 11:56:28 2024
The following diagnostic information is available:

         Reason: OS Fatal Error (0x00000006)
        Message: Kernel bug check
        Address: 0x3fff93641070
     Parameters: 0x3fff8636b0c0
    Stack Trace:
                 file://package1/windows/system32/sqlpal.dll+0x0000000000209C12
                 file://package1/windows/system32/sqlpal.dll+0x0000000000208156
                 file://package1/windows/system32/sqlpal.dll+0x00000000002410C9
                 file:///windows/System32/Drivers/AfdTl.sys+0x0000000000002625
                 file:///windows/System32/Drivers/AfdTl.sys+0x0000000000001701
                 file:///windows/System32/Drivers/AfdWsk.sys+0x0000000000004DF7
                 file:///windows/System32/Drivers/AfdWsk.sys+0x0000000000005AB2
                 file:///windows/System32/Drivers/AfdWsk.sys+0x0000000000005762
                 file:///windows/System32/Drivers/AfdWsk.sys+0x000000000000201F
                 file://package1/windows/system32/sqlpal.dll+0x0000000000240288
                 file://package1/windows/system32/sqlpal.dll+0x000000000023EC24
                 file:///windows/System32/Drivers/AfdWsk.sys+0x00000000000056FF
                 file:///windows/System32/Drivers/Http.sys+0x000000000001756B
                 file:///windows/System32/Drivers/Http.sys+0x00000000000A0250
                 file:///windows/System32/Drivers/Http.sys+0x000000000008CE72
                 file://package1/windows/system32/sqlpal.dll+0x0000000000244B55
                 file://package1/windows/system32/sqlpal.dll+0x00000000003A0501
                 file://package1/windows/system32/sqlpal.dll+0x000000000020F2DA
                 <unknown>+0x00003FFFA0D880B0
        Process: 31 - cosmosdb-emulator
         Thread: 8570 (application thread 0x138)
    Instance Id: 972b0e72-34e2-46c2-9972-63f2c29a0bde
       Crash Id: c0b2ea72-c3c6-4c36-8a89-7a7fa3f00928
    Build stamp: (null)
   Distribution: Ubuntu 22.04.5 LTS
     Processors: 8
   Total Memory: 33502388224 bytes
      Timestamp: Mon Dec  9 11:56:28 2024
     Last errno: -34938881
Last errno text: Unknown error -34938881
Capturing a dump of 31
Successfully captured dump: /tmp/cosmos/appdata/log/core.cosmosdb-emulator.12_9_2024_11_56_28.31
Executing: /usr/local/bin/cosmos/bin/handle-crash.sh with parameters
     handle-crash.sh
     /usr/local/bin/cosmos/bin/cosmosdb-emulator
     31
     /usr/local/bin/cosmos/bin
     /tmp/cosmos/appdata/log/
     
     972b0e72-34e2-46c2-9972-63f2c29a0bde
     c0b2ea72-c3c6-4c36-8a89-7a7fa3f00928
     
     /tmp/cosmos/appdata/log/core.cosmosdb-emulator.12_9_2024_11_56_28.31

Ubuntu 22.04.5 LTS
Capturing core dump and information to /tmp/cosmos/appdata/log...
/usr/local/bin/cosmos/bin/crash-support-functions.sh: line 379: hash: lsof: not found
dmesg: read kernel buffer failed: Operation not permitted
/usr/local/bin/cosmos/bin/crash-support-functions.sh: line 426: journalctl: command not found
Dump already generated: /tmp/cosmos/appdata/log/core.cosmosdb-emulator.12_9_2024_11_56_28.31, moving to /tmp/cosmos/appdata/log/core.cosmosdb-emulator.31.temp/core.cosmosdb-emulator.31.gdmp
Moving logs to /tmp/cosmos/appdata/log/core.cosmosdb-emulator.31.temp/log/paldumper-debug.log

To Reproduce

Unknown

Expected behavior

The emulator doesn't crash.

Desktop (please complete the following information):

Host machine:

╰─$ cat /etc/lsb-release
DISTRIB_ID=LinuxMint
DISTRIB_RELEASE=20.2
DISTRIB_CODENAME=uma
DISTRIB_DESCRIPTION="Linux Mint 20.2 Uma"
╰─$ uname -a
Linux fox 5.4.0-200-generic #220-Ubuntu SMP Fri Sep 27 13:19:16 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

timabell, Dec 09 '24 12:12

I don't know whether this is related to #70; the output is not quite the same.

timabell, Dec 09 '24 12:12

Unfortunately, because the entire container stops when the Cosmos process crashes, the /tmp volume is lost, so the log mentioned above is unavailable once the container is restarted:

$ docker exec devcontainer-cosmosdb-1  ls -l /tmp/cosmos/appdata/log/
total 0
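
One thing that might help capture the dump next time is copying the log directory out of the stopped container before restarting it. The following is only a sketch under assumptions: it assumes the stopped container has not been removed and that /tmp inside it is not a tmpfs mount (in which case nothing would survive the stop). The `save_crash_logs` function, the `DOCKER` override and the `LOG_DIR` host directory are made up for illustration; only `docker cp` and the container path from the crash output above are real.

```shell
# Sketch only: copy the emulator's crash logs out of a stopped container.
# DOCKER and LOG_DIR are hypothetical knobs, not part of the emulator.
DOCKER="${DOCKER:-docker}"                  # set DOCKER=echo for a dry run
LOG_DIR="${LOG_DIR:-./cosmos-crash-logs}"   # assumed host directory for the copy

save_crash_logs() {
    container="$1"
    mkdir -p "$LOG_DIR" || return 1
    # The trailing "/." copies the directory's contents, not the directory itself.
    "$DOCKER" cp "$container:/tmp/cosmos/appdata/log/." "$LOG_DIR/"
}
```

It would be run as `save_crash_logs devcontainer-cosmosdb-1` after the crash and before the container is started again. I have not confirmed that the dump survives the container stopping in this setup.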

timabell, Dec 09 '24 12:12