Uncaught exception "deadlock: shared mutex lock not available within 15 (write held)"
While running on an RPi 4 with gpsd coord, a USB rtl8187 wireless adapter, and internal bluetooth:
Uncaught exception "deadlock: shared mutex lock not available within 15 (write held)"
Stack trace (most recent call last) in thread 22339:
#1 Object ", at 0xb6c4f5af, in
#0 Object "/usr/bin/kismet, at 0x3de0bf, in TerminationHandler()
Uncaught exception "std::exception"
FATAL - Capture source did not get PING from Kismet for over 15 seconds; shutting down
FATAL - Capture source did not get PING from Kismet for over 15 seconds; shutting down
Aborted
This is usually linked to either the rtl8187 drivers causing system-wide problems, or to not having enough CPU or I/O to keep up.
I'd need a debug dump to try to track it down more - https://www.kismetwireless.net/docs/readme/debugging/
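Something along these lines usually works for grabbing a backtrace under gdb (a rough sketch; the debugging doc above has the full steps, and the log file name here is just an example):
gdb kismet
(gdb) set logging file gdb.txt
(gdb) set logging on
(gdb) run
    ... wait for / reproduce the crash ...
(gdb) thread apply all bt full
(gdb) quit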
I've restarted it with gdb and will let it run. It was something like ~24-48h uptime when it crashed, so I'll wait for a repeat occurrence.
For disk I/O, the most recent error I received was: what(): disk too slow for logging.
I'm not sure what would be causing the disk to be slow enough to elicit this response, as nothing disk-heavy should be running on this RPi. The card is a Samsung 256GB 100MB/s (U3) MicroSDXC (MB-ME256GA/AM).
gdb.txt
Running atop, it appears that the writes from the kismet application continue to increase in size over time (something with the db?). I'm watching as it has now grown to about 8MB per 10s interval, up from <2MB.
I'm going to assume that it eventually increases to a size that takes too long to write to the SD card.
I'm now curious what can be tuned to reduce disk writes on SD-card-based systems. I can use an external HD, but that doesn't solve the problem of ever-increasing disk write size and time.
I'm going to let atop run to see if it increases further over night/tomorrow.
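In case it's useful to anyone else following along, this is roughly how I'm recording the samples (the path and interval are just what I happen to use):
# record a raw atop sample every 10 seconds, to replay later
atop -w /home/pi/atop_kismet.raw 10 &
# replay it afterwards, starting at a time of interest
atop -r /home/pi/atop_kismet.raw -b 14:00
# inside atop, 'd' switches to the disk-oriented view (per-process WRDSK)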
I had something similar when running kismet for several days. I think there may be some kind of memory leak, because memory grows and then after about a day swap memory starts growing as well and eventually fills up. When it does, the error occurs. When I run using --no-logging, it will run a few days longer, and then the same error occurs.
Kismet continually tracks data - the more devices seen the more memory it will use, unless you tune it to discard devices and associations over time as per the memory tuning options in the documentation.
Running it on an RPi in a busy environment is a significantly under-powered platform.
I have seen no indications of memory leaks in very long-running processes, but if you have any debug info and you think you've found a leak I'm happy to look at it. I've been running servers for months, though, with no unexpected memory gains.
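For reference, the tuning referred to above looks roughly like this in kismet_site.conf (values are only examples; the memory-tuning section of the docs has the authoritative option list):
# illustrative values only - drop devices not seen for an hour
tracker_device_timeout=3600
# and cap how many devices are tracked at once
tracker_max_devices=10000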
Same issue (what(): disk too slow for logging) with the follow-up debug execution:
gdb-2.txt
From atop -r 20200117:
DSK | mmcblk0 | busy 14% | read 6228 | write 3570e4
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/5
32351 80m31s 2h41m 582.6M 505.6M 316K 105.2G pi pi N- - 7 S 3 3% kismet
That's a whole lot of disk writes... wtf?
-rw-r--r-- 1 pi pi 212M Jan 17 14:12 Kismet-20200116-05-41-42-1.kismet
-rw-r--r-- 1 pi pi 48M Jan 17 14:12 Kismet-20200116-05-41-42-1.kismet-journal
From atop -r 20200117 -b 14:20, which covers the last minute atop recorded the application still executing:
CPU | sys 10% | user 7% | irq 0% | idle 365% | wait 18% | guest 0% | ipc notavail | cycl unknown | avgf 751MHz | avgscal 50% |
MEM | tot 3.8G | free 2.0G | cache 770.2M | buff 54.4M | slab 41.3M | shmem 9.4M | shrss 0.0M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M |
SWP | tot 4.1G | free 4.1G |
DSK | mmcblk0 | busy 14% | read 0 | write 45254 | KiB/r 0 | KiB/w 10 | MBr/s 0.0 | MBw/s 0.8 | avq 2.32 | AVIO 1.89 ms |
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/59
32351 17.25s 30.53s -6740K -4804K 0K 433.2M pi pi -- - 8 t 0 8% kismet
PID TID RDDSK WRDSK WCANCL DSK CMD 1/59
32351 - 0K 433.2M 0K 100% kismet
74 - 0K 136K 0K 0% jbd2/mmcblk0p2
NPROCS SYSCPU USRCPU VSIZE RSIZE PSIZE SWAPSZ RDDSK WRDSK RNET SNET CPU CMD 1/4
1 17.25s 30.53s 854.2M 776.6M 774.0M 0K 0K 433.2M 0 0 8% kismet
PID TID MINFLT MAJFLT VSTEXT VSLIBS VDATA VSTACK VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/60
32351 - 269256 0 4728K 13100K 829.5M 132K 854.2M 776.6M 774.0M -6740K -4804K 0K pi pi 20% kismet
32086 - 744 0 5080K 9044K 45620K 132K 68700K 59364K 56661K 2520K 2440K 0K pi pi 1% gdb
PID TID PPID RUID RGID EUID EGID SUID SGID FSUID FSGID STDATE STTIME ENDATE ENTIME ST EXC S CPU CMD 1/59
32351 - 32086 pi pi pi pi pi pi pi pi 2020/01/15 21:41:40 active active -- - t 8% kismet
PID TID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/1
32351 - 17.25s 30.53s -6740K -4804K 0K 433.2M pi pi -- - 8 t 0 8% kismet
32351 32351 0.26s 0.35s -6740K -4804K 0K 0K pi pi -- - 1 t 0 0% kismet
32351 32378 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% kismet
32351 32379 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 1 0% msgbus
32351 32380 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% eventbus
32351 32381 0.21s 0.51s -6740K -4804K 0K 0K pi pi -- - 1 t 3 0% packethandler
32351 32423 0.00s 0.00s -6740K -4804K 0K 0K pi pi -- - 1 t 2 0% MHD-listen
32351 32424 4.06s 3.92s -6740K -4804K 0K 14380K pi pi -- - 1 t 3 1% timers
32351 14225 3.01s 6.56s 854.2M 776.6M 0K 103.8M pi pi N- - 1 t 0 2% timers
12824 - 0.92s 0.11s -53.6M -7548K 0K 0K pi pi -- - 1 Z 3 0% kismet_cap_lin
12823 - 0.05s 0.02s -12.0M -628K 0K 0K pi pi -- - 1 Z 3 0% kismet_cap_lin
12823 - 0.00s 0.00s 0K 0K - - pi - -E 0 0 E - 0% <kismet_cap_l>
12824 - 0.00s 0.00s 0K 0K - - pi - -E 1 0 E - 0% <kismet_cap_l>
PID TID RDDSK WRDSK WCANCL DSK CMD 1/1
32351 - 0K 433.2M 0K 100% kismet
32351 32351 0K 0K 0K 0% kismet
32351 32378 0K 0K 0K 0% kismet
32351 32379 0K 0K 0K 0% msgbus
32351 32380 0K 0K 0K 0% eventbus
32351 32381 0K 0K 0K 0% packethandler
32351 32423 0K 0K 0K 0% MHD-listen
32351 32424 0K 14380K 0K 3% timers
32351 14225 0K 103.8M 0K 24% timers
12824 - 0K 0K 0K 0% kismet_cap_lin
12823 - 0K 0K 0K 0% kismet_cap_lin
12823 - - - - 0% <kismet_cap_l>
12824 - - - - 0% <kismet_cap_l>
It appears that the SD card is IOPS-limited, in that kismet did 45254 write ops in under 60s before the process was terminated. I'd like to know what ramps up the write IOPS over time, and how to disable that. Is it DB optimization, or persistent_compression=true?
Thank you.
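As a sanity check on the card itself, I may try something like this with fio to get a rough random-write IOPS number (parameters are just a guess, not a tuned benchmark, and it assumes the libaio engine is available):
fio --name=sdcard-randwrite --filename=/home/pi/fio.test --size=256M \
    --rw=randwrite --bs=4k --direct=1 --ioengine=libaio \
    --runtime=60 --time_based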
Narrowing this down a bit more, even with logging_enabled=false in kismet_site.conf, the disk writes continue, and write ops increase over time.
I'm thinking that this may be due to logging or tracking for just the HTTP server.
The logging_enabled=false setting in my site file did not disable these writes, but launching kismet with the -n parameter to force logging off stopped all disk writes from the application.
The docs at https://www.kismetwireless.net/docs/readme/logging/ show logging_enabled=false, vs. the source at https://github.com/kismetwireless/kismet/blob/2ddf8ac56418f4dcaa88674ffa0473b242a9deda/logtracker.cc#L138
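To summarize what I'm actually running: the -n flag is what stopped the writes, while the config option below is copied from that docs page (it may not match what logtracker.cc reads):
# works: force logging off from the command line
kismet -n
# documented for kismet_site.conf, but did not stop the writes here
logging_enabled=false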
Moving on to disabling the Kismet DB and using only log_types=pcapng, my initial observation via atop is that write ops are low and mostly consistent at ~5 every 10s so far.
I will need to keep kismet running to determine if the ops increase.
Unlike the kismet log type, the pcapng type has not contributed to ever-increasing write ops. It does appear to be an issue with the Kismet DB.
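The relevant line for that test, assuming log_types is the right knob for dropping the kismetdb log (it's commented back out in the full config pasted below):
# log only pcapng; omit the 'kismet' (sqlite) log type
log_types=pcapng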
gps=gpsd:host=localhost,port=2947,reconnect=true
source=wlan1:name=adapter24ghs:type=rtl8187,channels="1,2,3,4,5,6,7,8,9,10,11,12,13"
source=hci0:name=bluetooth:type=linuxbluetooth
track_device_seenby_views=false
track_device_phy_views=false
manuf_lookup=false
track_device_rrds=false
kis_log_packets=false
kis_log_messages=false
#logging_enabled=false
#enable_logging=false
#log_types=pcapng
#dot11_view_accesspoints=false
#dot11_fingerprint_devices=false
#keep_datasource_signal_history=false
#keep_location_cloud_history=false
#tracker_device_timeout=60
Leaving off for tonight with the kismet_site.conf I used while testing (above). The optimization options at the end did not eliminate the increasing write-ops issue.