Crash/lockup reading ttyACM device

Open craigerl opened this issue 1 year ago • 26 comments

Describe the bug

When a PiZero2W is connected to an icom705 (ham radio) via USB, the following command instantly locks up Raspberry Pi OS Bookworm with all patches applied. The console is unresponsive and the network is dead. Even a non-root user can crash the system. The activity light blinks once per second. USB stops, and eventually ext4 reports I/O errors. I suspect the usb/sd/wifi hardware stops at this point. The scheduler and at least systemd are still running.

sudo /usr/sbin/gpsd -n -b -N -D 2 /dev/ttyACM1

Not observed on a generic ublox gps. Not observed on a Pi5. PiZero2W locks up every time. No oops on console, kern.log, syslog.

A non-root user can crash the system simply by running gpsd against ttyACM1.

The radio presents an audio device, a serial port, and a GPS serial device (the latter two as ttyACM0 and ttyACM1 respectively).

Kernel bug introduced sometime after 6.1.74(jan24), as regressing to this kernel resolves the problem. rpi-update d86b5843d68b9972a5430a6d3da1b271cfc83521

Steps to reproduce the behaviour

PiZero2W, icom705 radio, gpsd.

sudo /usr/sbin/gpsd -n -b -N -D 2 /dev/ttyACM1

Device(s)

Raspberry Pi Zero 2 W

System

Raspberry Pi reference 2023-10-10 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, fb56ad562991cf3ae5c96ab50983e1deeaefc7b6, stage2

Aug 30 2024 19:19:24 Copyright (c) 2012 Broadcom version 2808975b80149bbfe86844655fe45c7de66fc078 (clean) (release) (start_cd)

Linux digipi 6.6.51+rpt-rpi-v7 #1 SMP Raspbian 1:6.6.51-1+rpt3 (2024-10-08) armv7l GNU/Linux

Logs

root@digipi:~# /usr/sbin/gpsd -n -b -N -D 9 /dev/ttyACM1
gpsd:WARN: This system has a 32-bit time_t.  This gpsd will fail at 2038-01-19T03:14:07Z.
gpsd:INFO: launching (Version 3.22)
gpsd:IO: opening IPv4 socket
gpsd:SPIN: passivesock_af() -> 3
gpsd:IO: opening IPv6 socket
gpsd:SPIN: passivesock_af() -> 4
gpsd:INFO: listening on port gpsd
gpsd:PROG: NTP: shmat(0,0,0) succeeded, segment 0
gpsd:PROG: NTP: shmat(1,0,0) succeeded, segment 1
gpsd:PROG: NTP: shmat(2,0,0) succeeded, segment 2
gpsd:PROG: NTP: shmat(3,0,0) succeeded, segment 3
gpsd:PROG: NTP: shmat(4,0,0) succeeded, segment 4
gpsd:PROG: NTP: shmat(5,0,0) succeeded, segment 5
gpsd:PROG: NTP: shmat(6,0,0) succeeded, segment 6
gpsd:PROG: NTP: shmat(7,0,0) succeeded, segment 7
gpsd:PROG: successfully connected to the DBUS system bus
gpsd:PROG: shmget(0x47505344, 24232, 0666) for SHM export succeeded
gpsd:PROG: shmat() for SHM export succeeded, segment 8
gpsd:INFO: stashing device /dev/ttyACM1 at slot 0
gpsd:PROG: no /etc/gpsd/device-hook present, skipped running ACTIVATE hook
gpsd:INFO: SER: opening read-only GPS data source type 3 and at '/dev/ttyACM1'

At this point the sdcard and usb interface is gone... systemd can still write to the console.

Additional context

Rolling back kernel to 6.1.74 (Jan 24) resolves the issue.

  • rpi-update d86b5843d68b9972a5430a6d3da1b271cfc83521

Problem observed with:

  • Raspberry Pi Zero2w
  • Raspberry Pi OS Bookworm, Linux 6.6.51+rpt-rpi-v7
  • Icom705 GPS on /dev/ttyACM1

Not a problem if I switch any one of these to:

  • Pi5
  • ublox gps
  • kernel 6.1.74

https://gitlab.com/gpsd/gpsd/-/issues/303#note_2174127901

craigerl avatar Oct 24 '24 14:10 craigerl

Tailing kern.log and syslog during the lockup provided no information.

craigerl avatar Oct 24 '24 14:10 craigerl

Does the Icom705 GPS have its own power supply? Have you tried putting the GPS behind a powered hub?

pelwell avatar Oct 24 '24 14:10 pelwell

Does the Icom705 GPS have its own power supply? Have you tried putting the GPS behind a powered hub?

good questions, yes and yes. No change.

craigerl avatar Oct 24 '24 15:10 craigerl

Hi, I'm Gary Miller from the gpsd project.

I still don't understand:

the sdcard [...] is gone

Can it be read after a reboot?

garyemiller avatar Oct 24 '24 18:10 garyemiller

Hi, I'm Gary Miller from the gpsd project.

I still don't understand:

the sdcard [...] is gone

Can it be read after a reboot?

Yes, sorry for being vague. It reboots fine (until the gpsd service starts and the gps is connected).

It's as if the PiZero2W "io chip" stops the same exact moment gpsd starts. no wifi, no usb, no sd card.

systemd starts complaining (countdown) for three services which are timing out.

The kernel ext4 driver reports an io error on an inode

You can "cat /dev/ttyACM0" and see the output just fine. It's something about how gpsd talks to ttyACM0 that kills sd/usb/wifi.

thanks, -craig

craigerl avatar Oct 24 '24 19:10 craigerl

Appears to be resolved with Kernel 6.6.58 15869f639ce259cca9d3857eb86316f963d4b1e9

craigerl avatar Oct 31 '24 14:10 craigerl

I take that back, it worked once. I'll play the high/low game and see if I can isolate which kernel commit broke ttyACM on PiZero2w using gpsd.

craigerl avatar Oct 31 '24 14:10 craigerl

Bug was introduced in this commit (kernel went from 6.1.x to 6.6.x)

kernel: Bump to 6.6.16 kernel from next branch
popcornmix committed on Feb 8
e632362b0399b4ce331aacd9386685bc60938ab7

craigerl avatar Oct 31 '24 17:10 craigerl

Oh that's good - only about 70,000 commits between the two....

pelwell avatar Oct 31 '24 17:10 pelwell

update: 6.12.20+rpt-rpi-v7 is working

edit: sorry, false negative, it's crashing like the others (gps wasn't locked yet)

craigerl avatar Apr 30 '25 13:04 craigerl

Problem exists with 6.12.25+rpt-rpi-v7. "gpsd -n -b /dev/ttyACM1", or even just "gpsd /dev/ttyACM1" as a mortal user, instantly locks up a PiZero2W. No hdmi, no wifi, no bluetooth, sd doesn't blink, all connections drop. "cat /dev/ttyACM1" works fine.

I'll go back and retest 6.12.20.

craigerl avatar May 29 '25 01:05 craigerl

Confirming kernel bug was introduced between 6.1.x series and 6.6.x series. When gpsd reads from ttyACM1, the PiZero2W (Pi5 is fine) locks up, green LED blinks 17 times (once per second), then either stays green or goes out. Everything suddenly stops, hdmi disconnects, no wifi, no death cry on console or kern.log, no hdmi either tho.

gpsd opens the device fine, but lockup doesn't happen until data is actually read.

cat /dev/ttyACM1 works fine. In fact, I'm using a named pipe as a workaround for gpsd on a PiZero2W.

Any user can take down a PiZero2W by simply running gpsd against a ttyACM device.

craigerl avatar May 30 '25 11:05 craigerl
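
The named-pipe workaround mentioned above could look roughly like the following. This is a minimal sketch, not the exact setup used in this thread: the FIFO path /tmp/gps_fifo and the function name start_gps_relay are hypothetical, and it assumes gpsd accepts a named pipe as a data source. The idea is that cat, which reads the device without triggering the lockup, does the reading, while gpsd only ever opens the pipe.

```shell
#!/bin/sh
# Sketch of a named-pipe relay between a ttyACM device and gpsd.
# start_gps_relay and the default paths are illustrative assumptions.
start_gps_relay() {
    device="${1:-/dev/ttyACM1}"
    fifo="${2:-/tmp/gps_fifo}"

    # Create the named pipe once; reuse it on later runs.
    [ -p "$fifo" ] || mkfifo "$fifo" || return 1

    # Refuse to continue if the serial device is not plugged in.
    if [ ! -c "$device" ]; then
        echo "no serial device at $device" >&2
        return 1
    fi

    # cat reads the device and feeds the pipe in the background.
    cat "$device" > "$fifo" &

    # gpsd reads the pipe instead of the device: -n polls immediately,
    # -b keeps the source read-only, -N stays in the foreground.
    gpsd -n -b -N "$fifo"
}
```

Calling start_gps_relay with no arguments uses the default paths; a different device or pipe can be passed as the first and second argument.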

Are you able to reproduce the lockup without a HAM radio costing ~£1500?

P33M avatar May 30 '25 12:05 P33M

Are you able to reproduce the lockup without a HAM radio costing ~£1500?

Not yet. a cheap ublox gps uses ttyACM and seems to work, it may have locked up once.

It feels like a race condition. For example, if I run "strace gpsd" it works 90% of the time. Without strace, it locks up 100%.

Since "cat /dev/ttyACM1" works, it might be the ACM driver not liking the speed/bits/parity setup.

craigerl avatar May 30 '25 12:05 craigerl
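
One way to look into the speed/bits/parity idea is to dump the line settings of the port after a plain cat and again after a gpsd run that survives (for example under strace), then compare the two. The following is only a sketch: the helper name dump_tty_settings is hypothetical, and it assumes a stty that supports -F, as GNU coreutils does.

```shell
#!/bin/sh
# Print the current termios settings of a serial device.
# dump_tty_settings is an illustrative helper, not part of gpsd.
dump_tty_settings() {
    dev="${1:-/dev/ttyACM1}"

    # Bail out early if the path is not a character device.
    if [ ! -c "$dev" ]; then
        echo "not a character device: $dev" >&2
        return 1
    fi

    # -a prints every setting: speed, character size, parity, flow control.
    stty -F "$dev" -a
}
```

Saving the output of each run to a file and comparing the files with diff would show which settings gpsd changes.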

Reproduced on Pi3A+ and Pi3B+ (BCM2835 seems to be the common factor).

craigerl avatar May 30 '25 12:05 craigerl

Have you tried with dtoverlay=dwc2,dr_mode=host in config.txt? That will cause the Pi to use a slower but more battle-hardened USB driver.

pelwell avatar May 30 '25 13:05 pelwell

Have you tried with dtoverlay=dwc2,dr_mode=host in config.txt? That will cause the Pi to use a slower but more battle-hardened USB driver.

Added - it's working!!

Use this for all BCM2835 models?

[pi5], [cm5], [pi500], [pi4], [pi400], [cm4], [pi3], [pi2], [pi0], and [pi0w] (what is a Zero2W?)

thank you! really, i spent a lot of time tearing apart gpsd, and went through dozens of raspi-config kernels.

craigerl avatar May 30 '25 13:05 craigerl

It's most necessary on models without PCIe, i.e. Pis 1-3 and the Zeroes. Without the explicit dr_mode it should auto-detect when it is being used in gadget mode, but you can also force that with dr_mode=peripheral. Run dtoverlay -h dwc2 for full usage.

Newer Pis still have the old DWC OTG USB controller, but it is hooked up to the USB-C/power socket. Some applications put it into peripheral mode, e.g. the rpiboot mass_storage_gadget64 mode.

pelwell avatar May 30 '25 14:05 pelwell

Can someone look at this file and suggest some changes to explain the fix?

https://gitlab.com/gpsd/gpsd/-/blob/master/INSTALL.adoc?ref_type=heads

garyemiller avatar May 30 '25 18:05 garyemiller

Unlikely - there's little point; we won't be updating the dwc-otg driver.

pelwell avatar May 30 '25 18:05 pelwell

Can someone look at this file and suggest some changes to explain the fix?

https://gitlab.com/gpsd/gpsd/-/blob/master/INSTALL.adoc?ref_type=heads

If your Raspberry Pi locks up as gpsd starts, particularly on PiZero2/3B+/3A+, try using the alternate USB driver. Edit /boot/firmware/config.txt and add the line "dtoverlay=dwc2,dr_mode=host". It's slower, but more battle-hardened.

craigerl avatar May 30 '25 18:05 craigerl
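
The config.txt change described above can be applied with a small idempotent helper. This is a sketch under stated assumptions: the function name enable_dwc2_host is hypothetical, and appending to the end of the file assumes the last section of config.txt is [all] or unfiltered, so that the line applies to the board in use. A reboot is needed before the overlay takes effect.

```shell
#!/bin/sh
# Append the dwc2 host-mode overlay to a config file unless it is already there.
# enable_dwc2_host is an illustrative helper name.
enable_dwc2_host() {
    config="${1:-/boot/firmware/config.txt}"
    line='dtoverlay=dwc2,dr_mode=host'

    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if grep -qxF "$line" "$config" 2>/dev/null; then
        echo "already present in $config"
    else
        printf '%s\n' "$line" >> "$config" && echo "added to $config"
    fi
}
```

Run as root, enable_dwc2_host with no argument edits /boot/firmware/config.txt; passing another path lets you try it on a copy first.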

Unlikely - there's little point; we won't be updating the dwc-otg driver.

Which is exactly why gpsd needs to document the workaround. A lot of people have been hit by this and blame gpsd for it.

garyemiller avatar May 30 '25 18:05 garyemiller

Oh, you mean suggest changes to them... your question was ambiguous.

pelwell avatar May 30 '25 18:05 pelwell

Oh, you mean suggest changes to them... your question was ambiguous.

"Them" is me. I'm with the gpsd project. Sorry that was not clear.

garyemiller avatar May 30 '25 18:05 garyemiller

The gpsd doc has been updated with the workaround. So from the gpsd angle, this is done.

If I was part of this project, I'd want it in the local doc as well.

garyemiller avatar Jun 12 '25 00:06 garyemiller

Same here on Pi3B+, raspbian 12 kernel 6.12.34+rpt-rpi-v7, connected to a LILYGO T-Beam. Everything touching /dev/ttyACM0 - including minicom - made the system crash badly. Couldn't wrap my head around this for days, until I found this issue. dtoverlay=dwc2,dr_mode=host did the trick. Thanks @pelwell !

andreworg avatar Aug 21 '25 07:08 andreworg