Panda firmware flash failing

Open jyoung8607 opened this issue 2 years ago • 1 comments

Describe the bug

Panda update to master-current firmware failed. Not 100% sure it was a DFU-required update, as it wasn't a new C3, but in all previous logs it was running release3. Owner reported ignition off but the device was still in a NO PANDA state. Found the following being spammed to console:

OSError: [Errno 24] Too many open files
common/params.cc: Failed to lock file /data/params/.lock, errno=24
pandad.uncaught_exception
Traceback (most recent call last):
  File "/data/openpilot/selfdrive/boardd/pandad.py", line 91, in main
    dfu_serials = PandaDFU.list()
  File "/data/openpilot/panda/python/dfu.py", line 64, in list
  File "/data/openpilot/panda/python/dfu.py", line 85, in spi_list
  File "/data/openpilot/panda/python/dfu.py", line 51, in spi_connect
  File "/data/openpilot/panda/python/spi.py", line 206, in __init__
  File "/data/openpilot/panda/python/spi.py", line 75, in __init__
OSError: [Errno 24] Too many open files
common/params.cc: Failed to lock file /data/params/.lock, errno=24
pandad.uncaught_exception
Traceback (most recent call last):
  File "/data/openpilot/selfdrive/boardd/pandad.py", line 91, in main
    dfu_serials = PandaDFU.list()
  File "/data/openpilot/panda/python/dfu.py", line 64, in list
  File "/data/openpilot/panda/python/dfu.py", line 85, in spi_list
  File "/data/openpilot/panda/python/dfu.py", line 51, in spi_connect
  File "/data/openpilot/panda/python/spi.py", line 206, in __init__
  File "/data/openpilot/panda/python/spi.py", line 75, in __init__
OSError: [Errno 24] Too many open files
common/params.cc: Failed to lock file /data/params/.lock, errno=24
pandad.uncaught_exception
Traceback (most recent call last):
  File "/data/openpilot/selfdrive/boardd/pandad.py", line 91, in main
    dfu_serials = PandaDFU.list()
  File "/data/openpilot/panda/python/dfu.py", line 64, in list
  File "/data/openpilot/panda/python/dfu.py", line 85, in spi_list
  File "/data/openpilot/panda/python/dfu.py", line 51, in spi_connect
  File "/data/openpilot/panda/python/spi.py", line 206, in __init__
  File "/data/openpilot/panda/python/spi.py", line 75, in __init__
OSError: [Errno 24] Too many open files
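
The traceback ends inside a constructor in spi.py, reached from a call that pandad keeps retrying. One possible explanation, which is an assumption and not something confirmed in this thread, is that each failed attempt leaves a file descriptor open until the process hits its limit and every later open fails with errno 24. A minimal Python sketch of that general pattern follows. `FakeHandle`, `poll`, and `open_fd_count` are hypothetical names made up for illustration and are not the real panda code.

```python
import os


def open_fd_count() -> int:
    """Number of file descriptors open in this process (Linux/macOS: lists /dev/fd)."""
    return len(os.listdir("/dev/fd"))


class FakeHandle:
    """Hypothetical device handle for illustration; not the real panda SPI class."""

    def __init__(self, path: str, close_on_error: bool):
        self.fd = os.open(path, os.O_RDONLY)  # raw descriptor, never closed by GC
        try:
            # Simulate a handshake that fails after the descriptor is already open.
            raise OSError("simulated handshake failure")
        except OSError:
            if close_on_error:
                os.close(self.fd)  # release the descriptor before propagating
            raise


def poll(path: str, attempts: int, close_on_error: bool) -> None:
    """Retry loop that swallows connection errors, like a daemon polling for devices."""
    for _ in range(attempts):
        try:
            FakeHandle(path, close_on_error)
        except OSError:
            pass


if __name__ == "__main__":
    for close_on_error in (False, True):
        before = open_fd_count()
        poll(os.devnull, attempts=20, close_on_error=close_on_error)
        print(close_on_error, open_fd_count() - before)
```

With `close_on_error=False` the count grows by one descriptor per failed attempt. With `close_on_error=True` it stays flat. Comparing the count before and after a batch of failed connects is one way to check whether a retry loop leaks.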

Provide a route where the issue occurs

214a25adc54636fc|2023-05-27--19-24-21

openpilot version

3f61537d1b7b33ee0a1104ca6a47bf323bf31c4b plus a fingerprint update

Additional info

After seeing the above, I manually stopped and restarted openpilot processes. It looks like the firmware flash started, but it never clearly progressed beyond a certain point or finished.

configuring modem
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
Panda 260057001751393037323631 connected, version: bootstub, signature , expected 20351be012899452
flash: unlocking
flash: erasing sectors 1 - 4
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
flash: flashing
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
flash: resetting
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
No IRQs found for 'xhci-hcd:usb3'
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
logmessaged timezoned ui soundd deleter pandad thermald tombstoned updated uploader statsd
...
[continued on for a minute or two without apparent progress]
...

After that, I stopped openpilot processes and launched pandad alone, which seems to have gotten the job done.

comma@tici:/data/openpilot$ selfdrive/boardd/pandad.py
Panda 260057001751393037323631 connected, version: bootstub, signature , expected 20351be012899452
flash: unlocking
flash: erasing sectors 1 - 4
flash: flashing
flash: resetting
programming 0 with length 2048
programming 1 with length 2048
programming 2 with length 2048
programming 3 with length 2048
programming 4 with length 2048
programming 5 with length 2048
programming 6 with length 2048
programming 7 with length 2048
flash: unlocking
flash: erasing sectors 1 - 4
flash: flashing
flash: resetting
selfdrive/boardd/main.cc: starting boardd
selfdrive/boardd/boardd.cc: attempting to connect
selfdrive/boardd/spi.cc: transfer failed, after 7 tries, 701.78ms
selfdrive/boardd/spi.cc: transfer failed, after 7 tries, 701.89ms
selfdrive/boardd/panda.cc: conntected to 260057001751393037323631 over USB
selfdrive/boardd/boardd.cc: connected to board

jyoung8607 • May 28 '23 03:05

In case it matters, we later determined this device/vehicle installation had intermittent connectivity issues to the OBD CAN bus, so sketchy connections and cables are in play as potential triggers for entering that initial loop.

jyoung8607 • Jun 01 '23 15:06

Wrote a test case https://github.com/commaai/openpilot/pull/28330, but couldn't repro with it. If this is somehow flaky, we'll see it in random CI failures going forward.

adeebshihadeh • Jun 08 '23 03:06