Uncaught exception "deadlock: shared mutex lock not available within 15 (write held)"
While running on an RPi 4 with gpsd coord, a USB rtl8187 wireless adapter, and internal bluetooth:
Uncaught exception "deadlock: shared mutex lock not available within 15 (write held)"
Stack trace (most recent call last) in thread 22339:
#1 Object ", at 0xb6c4f5af, in
#0 Object "/usr/bin/kismet, at 0x3de0bf, in TerminationHandler()
Uncaught exception "std::exception"
FATAL - Capture source did not get PING from Kismet for over 15 seconds; shutting down
FATAL - Capture source did not get PING from Kismet for over 15 seconds; shutting down
Aborted
This is usually linked to either the rtl8187 drivers causing system-wide problems, or to not having enough CPU or I/O to keep up.
I'd need a debug dump to try to track it down more - https://www.kismetwireless.net/docs/readme/debugging/
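Something along these lines usually works for grabbing a backtrace under gdb (a rough sketch; the debugging doc above has the full steps, and the log file name here is just an example):
gdb kismet
(gdb) set logging file gdb.txt
(gdb) set logging on
(gdb) run
    ... wait for / reproduce the crash ...
(gdb) thread apply all bt full
(gdb) quit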
I've restarted it with gdb and will let it run. It was something like ~24-48h uptime when it crashed, so I'll wait for a repeat occurrence.
For disk I/O, the most recent error I received was: what(): disk too slow for logging.
I'm not sure what would be causing the disk to be slow enough to elicit this response, as nothing disk-heavy should be running on this RPi. The card is a Samsung 256GB 100MB/s (U3) MicroSDXC (MB-ME256GA/AM).
gdb.txt
Running atop, it appears that the writes from the kismet application continue to increase in size over time (something with the db?). I'm watching as it has now grown to about 8MB per 10s interval, up from <2MB.
I'm going to assume that it eventually increases to a size that takes too long to write to the SD card.
I'm now curious what can be tuned to reduce disk writes on SD-card-based systems. I can use an external HD, but that doesn't solve the problem of ever-increasing disk write size and time.
I'm going to let atop run to see if it increases further over night/tomorrow.
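In case it's useful to anyone else following along, this is roughly how I'm recording the samples (the path and interval are just what I happen to use):
# record a raw atop sample every 10 seconds, to replay later
atop -w /home/pi/atop_kismet.raw 10 &
# replay it afterwards, starting at a time of interest
atop -r /home/pi/atop_kismet.raw -b 14:00
# inside atop, 'd' switches to the disk-oriented view (per-process WRDSK)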
I had something similar when running kismet for several days. I think there may be some kind of memory leak, because memory grows and then after about a day swap memory starts growing as well and eventually fills up. When it does, the error occurs. When I run using --no-logging, it will run a few days longer, and then the same error occurs.
Kismet continually tracks data - the more devices seen the more memory it will use, unless you tune it to discard devices and associations over time as per the memory tuning options in the documentation.
Running it on an RPi in a busy environment is a significantly under-powered platform.
I have seen no indications of memory leaks in very long-running processes, but if you have any debug info and you think you've found a leak I'm happy to look at it. I've been running servers for months, though, with no unexpected memory gains.
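For reference, the tuning referred to above looks roughly like this in kismet_site.conf (values are only examples; the memory-tuning section of the docs has the authoritative option list):
# illustrative values only - drop devices not seen for an hour
tracker_device_timeout=3600
# and cap how many devices are tracked at once
tracker_max_devices=10000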
Same issue (what(): disk too slow for logging) with the follow-up debug execution:
gdb-2.txt
From atop -r 20200117:
DSK | mmcblk0 | busy 14% | read 6228 | write 3570e4
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/5
32351 80m31s 2h41m 582.6M 505.6M 316K 105.2G pi pi N- - 7 S 3 3% kismet
That's a whole lot of disk writes... wtf?
-rw-r--r-- 1 pi pi 212M Jan 17 14:12 Kismet-20200116-05-41-42-1.kismet
-rw-r--r-- 1 pi pi 48M Jan 17 14:12 Kismet-20200116-05-41-42-1.kismet-journal
From atop -r 20200117 -b 14:20, which covers the last minute atop recorded the application still executing:
CPU | sys 10% | user 7% | irq 0% | idle 365% | wait 18% | guest 0% | ipc notavail | cycl unknown | avgf 751MHz | avgscal 50% |
MEM | tot 3.8G | free 2.0G | cache 770.2M | buff 54.4M | slab 41.3M | shmem 9.4M | shrss 0.0M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M |
SWP | tot 4.1G | free 4.1G |
DSK | mmcblk0 | busy 14% | read 0 | write 45254 | KiB/r 0 | KiB/w 10 | MBr/s 0.0 | MBw/s 0.8 | avq 2.32 | AVIO 1.89 ms |
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/59
32351 17.25s 30.53s -6740K -4804K 0K 433.2M pi pi -- - 8 t 0 8% kismet
PID TID RDDSK WRDSK WCANCL DSK CMD 1/59
32351 - 0K 433.2M 0K 100% kismet
74 - 0K 136K 0K 0% jbd2/mmcblk0p2
NPROCS SYSCPU USRCPU VSIZE RSIZE PSIZE SWAPSZ RDDSK WRDSK RNET SNET CPU CMD 1/4
1 17.25s 30.53s 854.2M 776.6M 774.0M 0K 0K 433.2M 0 0 8% kismet
PID TID MINFLT MAJFLT VSTEXT VSLIBS VDATA VSTACK VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/60
32351 - 269256 0 4728K 13100K 829.5M 132K 854.2M 776.6M 774.0M -6740K -4804K 0K pi pi 20% kismet
32086 - 744 0 5080K 9044K 45620K 132K 68700K 59364K 56661K 2520K 2440K 0K pi pi 1% gdb
PID TID PPID RUID RGID EUID EGID SUID SGID FSUID FSGID STDATE STTIME ENDATE ENTIME ST EXC S CPU CMD 1/59
32351 - 32086 pi pi pi pi pi pi pi pi 2020/01/15 21:41:40 active active -- - t 8% kismet
PID TID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/1
32351 - 17.25s 30.53s -6740K -4804K 0K 433.2M pi pi -- - 8 t 0 8% kismet
32351 32351 0.26s 0.35s -6740K -4804K 0K 0K pi pi -- - 1 t 0 0% kismet
32351 32378 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% kismet
32351 32379 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 1 0% msgbus
32351 32380 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% eventbus
32351 32381 0.21s 0.51s -6740K -4804K 0K 0K pi pi -- - 1 t 3 0% packethandler
32351 32423 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% MHD-listen
32351 32424 4.06s 3.92s -6740K -4804K 0K 14380K pi pi -- - 1 t 3 1% timers
32351 14225 3.01s 6.56s 854.2M 776.6M 0K 103.8M pi pi N- - 1 t 0 2% timers
12824 - 0.92s 0.11s -53.6M -7548K 0K 0K pi pi -- - 1 Z 3 0% kismet_cap_lin
12823 - 0.05s 0.02s -12.0M -628K 0K 0K pi pi -- - 1 Z 3 0% kismet_cap_lin
12823 - 0.00s 0.00s 0K 0K - - pi - -E 0 0 E - 0% <kismet_cap_l>
12824 - 0.00s 0.00s 0K 0K - - pi - -E 1 0 E - 0% <kismet_cap_l>
PID TID RDDSK WRDSK WCANCL DSK CMD 1/1
32351 - 0K 433.2M 0K 100% kismet
32351 32351 0K 0K 0K 0% kismet
32351 32378 0K 0K 0K 0% kismet
32351 32379 0K 0K 0K 0% msgbus
32351 32380 0K 0K 0K 0% eventbus
32351 32381 0K 0K 0K 0% packethandler
32351 32423 0K 0K 0K 0% MHD-listen
32351 32424 0K 14380K 0K 3% timers
32351 14225 0K 103.8M 0K 24% timers
12824 - 0K 0K 0K 0% kismet_cap_lin
12823 - 0K 0K 0K 0% kismet_cap_lin
12823 - - - - 0% <kismet_cap_l>
12824 - - - - 0% <kismet_cap_l>
It appears that the SD card is IOPS-limited, in that kismet did 45254 write ops in under 60s before the process was terminated. I'd like to know what ramps up the write IOPS over time, and how to disable that. Is it DB optimization, or persistent_compression=true?
Thank you.
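As a sanity check on the card itself, I may try something like this with fio to get a rough random-write IOPS number (parameters are just a guess, not a tuned benchmark, and it assumes the libaio engine is available):
fio --name=sdcard-randwrite --filename=/home/pi/fio.test --size=256M \
    --rw=randwrite --bs=4k --direct=1 --ioengine=libaio \
    --runtime=60 --time_based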
Narrowing this down a bit more, even with logging_enabled=false in kismet_site.conf, the disk writes continue, and write ops increase over time.
I'm thinking that this may be due to logging or tracking for just the HTTP server.
The logging_enabled=false setting in my site file did not disable these writes, but launching kismet with the -n parameter to force logging off stopped all disk writes from the application.
The docs at https://www.kismetwireless.net/docs/readme/logging/ show logging_enabled=false, vs. the source at https://github.com/kismetwireless/kismet/blob/2ddf8ac56418f4dcaa88674ffa0473b242a9deda/logtracker.cc#L138
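To summarize what I'm actually running: the -n flag is what stopped the writes, while the config option below is copied from that docs page (it may not match what logtracker.cc reads):
# works: force logging off from the command line
kismet -n
# documented for kismet_site.conf, but did not stop the writes here
logging_enabled=false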
Moving on to disabling the Kismet DB and using only log_types=pcapng, my initial observation via atop is that write ops are low and mostly consistent at ~5 every 10s so far.
I will need to keep kismet running to determine if the ops increase.
Unlike the kismet log type, the pcapng type has not contributed to ever-increasing write ops. It does appear to be an issue with the Kismet DB.
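The relevant line for that test, assuming log_types is the right knob for dropping the kismetdb log (it's commented back out in the full config pasted below):
# log only pcapng; omit the 'kismet' (sqlite) log type
log_types=pcapng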
gps=gpsd:host=localhost,port=2947,reconnect=true
source=wlan1:name=adapter24ghs:type=rtl8187,channels="1,2,3,4,5,6,7,8,9,10,11,12,13"
source=hci0:name=bluetooth:type=linuxbluetooth
track_device_seenby_views=false
track_device_phy_views=false
manuf_lookup=false
track_device_rrds=false
kis_log_packets=false
kis_log_messages=false
#logging_enabled=false
#enable_logging=false
#log_types=pcapng
#dot11_view_accesspoints=false
#dot11_fingerprint_devices=false
#keep_datasource_signal_history=false
#keep_location_cloud_history=false
#tracker_device_timeout=60
Leaving off for tonight with the kismet_site.conf I used while testing (above). The optimization options at the end did not eliminate the increasing write-ops issue.