blink1
Kernel crash on Linux
It happened 4 times, happens only when I'm testing the blink1, I'm confident that that's the cause. The gist contains the content of /var/log/messages during the latest crash.
https://gist.github.com/riquito/5c48037b6929bacafdf7
It may be linked to the unplug of the device, but I'm not sure. I'll try to see if I can replicate reliably (not that I have fun crashing my pc :-P)
edit: Fedora 21, x86_64, kernel 3.17.4-301
Me too. Multiple crashes, multiple Fedora-21 machines. Hard crash of kernel requiring reboot. Sometimes on first use of blink1. Semi-reliable crash on first use after removing and reinserting blink1 in different USB port.
kernel-3.17.7-300.fc21.x86_64
(How did riquito get /var/log/messages? Isn't it in journalctl now?)
journalctl shows:
Jan 09 01:48:25 mldt kernel: thingm 0003:27B8:01ED.0006: hidraw2: USB HID v1.01 Device [ThingM blink(1) mk2] on usb-0000:00:1d.0-1.2/input0
Jan 09 01:48:25 mldt mtp-probe[2853]: checking bus 2, device 7: "/sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2"
Jan 09 01:48:25 mldt mtp-probe[2853]: bus: 2, device: 7 was not an MTP device
Jan 09 01:48:28 mldt kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
Jan 09 01:48:28 mldt kernel: IP: [
(( The previous comment is missing data. Apparently I can't paste text containing < or > . Also, the kernel guys won't like the tainted kernel from this machine. I can grab this again from different host, if needed? ))
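For anyone else trying to recover the oops after a forced reboot, here is a minimal sketch. It assumes systemd's journal is kept persistently (i.e. /var/log/journal exists); without that, the previous boot's messages are gone. The function name `prev_boot_kernel_log` is made up for illustration; `-k`, `-b -1` and `--no-pager` are standard journalctl options.

```shell
# Print kernel messages from the previous boot (the one that crashed).
# Assumption: the journal is persistent, otherwise "-b -1" finds nothing.
prev_boot_kernel_log() {
    # -k: kernel messages only; -b -1: previous boot; extra args are passed through
    journalctl -k -b -1 --no-pager "$@"
}
```

Redirecting the output to a file (`prev_boot_kernel_log > crash.txt`) and attaching it or putting it in a gist sidesteps the problem of pasting text that contains < or >.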
Hi! What exact distro and Linux version are you using? What exact commands are you running to control the blink(1)? Are you using a pre-compiled blink1-tool, or something else? If you are using a pre-compiled binary, what is the download URL of that binary?
mldt:~$ cat /etc/redhat-release
Fedora release 21 (Twenty One)
mldt:~$ uname -a
Linux mldt 3.17.7-300.fc21.x86_64 #1 SMP Wed Dec 17 03:08:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
mldt:commandline (master)$ git pull
Already up-to-date.
mldt:commandline (master)$ make EXEFLAGS=
Building for OS=linux BLINK1_VERSION=v1.95-linux-x86_64
cc -shared -o libblink1.so pkg-config libusb-1.0 --libs
-lrt -lpthread -ldl -DUSE_HIDAPI -I./hidapi/hidapi pkg-config libusb-1.0 --cflags
-fPIC -std=gnu99 -g -DBLINK1_VERSION="""v1.95"-linux-"x86_64""" ./hidapi/libusb/hid.o blink1-lib.o pkg-config libusb-1.0 --libs
-lrt -lpthread -ldl
I was just using simple commands like:
blink1-tool --on
blink1-tool --off
remove and reinsert blink1 in a different USB port
blink1-tool --on
On a different host (work) same distro and os (not-tainted) I tried running under gdb.
I got no useful information when crashed. It's just a total instantaneous system lockup.
The only visible oddity in journalctl was this error message when the blink1 was removed, but the crash didn't occur until the device was reinserted and a command sent, none of which made it into the logs.
Jan 09 16:14:11 work kernel: usb 1-1.3: USB disconnect, device number 3
Jan 09 16:14:11 work systemd-udevd[23533]: error opening USB device 'descriptors' file
-- Reboot --
This seems like an issue with the USB drivers in Fedora, or somehow HIDAPI (the library we use to talk to blink(1)) or libusb (the library HIDAPI uses) is tickling some problem further down the software stack.
Is there a different non-Fedora 21 you can try? Or, is there a list of changes between Fedora 18 (the last I tried) and Fedora 21?
I experience a crash too. I'm not using Fedora but Debian Jessie:
henning@henning-laptop:~$ uname -a
Linux henning-laptop 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64 GNU/Linux
Here is the dmesg output: https://gist.github.com/hprid/2f2f3063abc3c2bf16de#file-dmesg-blink1-txt-L33
I can't reproduce it exactly, but it has something to do with unplugging/replugging the device.
After some more research I found that the issue seems to be fixed in newer kernel versions with commit 67a97845830f79584c9db8849ac723e5d2d57f65, which is not present in Debian Jessie. After rebuilding the kernel with this patch I no longer can reproduce the issue.
Nice catch, @hprid. The patch should be available from kernel 3.17 onwards, if I'm not mistaken.
Thanks for the patch info, that's very interesting.
I think that kernel patch only applies if you're using the blink(1) kernel driver. At least one person commenting on this issue was using the userspace blink1-tool.
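One way to tell whether the kernel driver is involved is to look at which HID driver is bound to the device in sysfs. The sketch below assumes the usual /sys/bus/hid/devices layout and matches on the ThingM USB vendor ID 27B8 seen in the journalctl output earlier in this thread. The function name `blink1_hid_drivers` is hypothetical, not part of blink1-tool.

```shell
# List blink(1) HID devices (USB vendor ID 27B8) and the kernel driver bound to each.
# Optional first argument: the sysfs HID device directory (default /sys/bus/hid/devices).
blink1_hid_drivers() {
    hid_dir="${1:-/sys/bus/hid/devices}"
    for dev in "$hid_dir"/*:27B8:*; do
        [ -e "$dev" ] || continue              # glob matched nothing
        if [ -L "$dev/driver" ]; then
            # "driver" is a symlink to the bound driver's directory
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

If this prints a driver such as thingm (the name that prefixes the kernel log line above), the kernel driver is attached even when you only ever run the userspace blink1-tool, so the kernel patch could still be relevant.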
If anyone using blink1-tool is having a crash issue (and for the life of me I can't see how a userspace program should be able to crash Linux nowadays), I would like to enlist them to try two tests:
- If you haven't already, compile blink1-tool on your own system by checking out the repo and doing `cd blink1/commandline && make`. I've seen USB-based binaries "mostly work" across distros but act flaky. Maybe this is a version of that flakiness.
- Try using a hidraw build of blink1-tool instead of libusb. Do this with `cd blink1/commandline && make clean && make USBLIB_TYPE=HIDDATA`.
Test (2) above is more of a workaround than a solution, but it may at least not tickle the bug, since it uses a different low-level USB API.
The blink1-tool triggered the kernel crash for me. I also tried recompiling blink1-tool, and that build crashed the kernel too. The easiest way to crash the kernel reproducibly (before applying the mentioned patch) was:
while true; do ./blink1-tool --list; done
and replugging the blink1 several times. Just replugging it several times without running blink1-tool doesn't crash the kernel.
Haven't tried a hidraw build, but can test it tomorrow if you like.
@riquito The patch isn't in 3.17, last commit regarding the patched file on 3.17.8 is e4aecaf2f53bc6635b484ee2f1b8a1e4c73e7997 (Tue Jun 3 13:29:38 2014 +0200). First kernel version with the patch is 3.18.
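To check a given machine against that 3.18 cutoff, here is a rough sketch. It assumes GNU `sort -V` is available and only compares version numbers, so it cannot see distro backports (a patched 3.16 would still report as older). `kernel_at_least` is a hypothetical helper name.

```shell
# Succeeds if kernel release $1 is at least version $2 (relies on GNU sort -V).
kernel_at_least() {
    have="${1%%-*}"    # strip the distro suffix: 3.17.7-300.fc21.x86_64 -> 3.17.7
    want="$2"
    # If the smaller of the two versions is the wanted one, we are at or above it.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

if kernel_at_least "$(uname -r)" 3.18; then
    echo "3.18 or newer: mainline should already contain the fix"
else
    echo "older than 3.18: fixed only if the distro backported the patch"
fi
```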
I think this is the same issue (although for me it always crashes).
I'm using Debian Wheezy (3.16). I have the udev rules set up.
Following todbot's instructions from above when I do:
1 - The build compiles and I can run ./blink1-tool. However, whenever I send a command to the device, such as a simple ./blink1-tool --on, my whole system freezes completely. Note that the command does (mostly) reach the blink light, but the only means of recovery is to turn the machine off at the power button. I've tried this a number of times and it is repeatable.
2 - When I do `make USBLIB_TYPE=HIDDATA` it compiles, but I can't send anything to the blink light. See the errors encountered below for various commands.
`joseph@pixel:~/dev-home/blink1/commandline$ ./blink1-tool --list
blink(1) list:
id:0 - serialnum:
(Listing not supported in HIDDATA builds)
joseph@pixel:~/dev-home/blink1/commandline$ ./blink1-tool --on
set dev:0 to rgb:0xff,0xff,0xff over 300 msec
Error sending message: error sending control message: Device or resource busy
joseph@pixel:~/dev-home/blink1/commandline$ ./blink1-tool --on
set dev:0 to rgb:0xff,0xff,0xff over 300 msec
Error sending message: error sending control message: Device or resource busy
joseph@pixel:~/dev-home/blink1/commandline$ sudo ./blink1-tool --on
[sudo] password for joseph:
set dev:0 to rgb:0xff,0xff,0xff over 300 msec
Error sending message: error sending control message: Device or resource busy
joseph@pixel:~/dev-home/blink1/commandline$ sudo ./blink1-tool --on -v
deviceId[0] = 0
cached list:
0: serial: '' ''
openById: 0
set dev:0 to rgb:0xff,0xff,0xff over 300 msec
Error sending message: error sending control message: Device or resource busy `
Any and all suggestions welcome. I can pull out logs if you tell me what to look for and/or where to go. This is a very vanilla Debian Wheezy installation.
Thanks, Joseph