Candump timing issue with MCP2518FD on Raspberry Pi 5

Open filogold opened this issue 3 months ago • 4 comments

Hi,
I’m running into a timing problem while using MCP2518FD CAN FD controllers on a Raspberry Pi 5.

My setup:

  • Raspberry Pi 5 (16 GB RAM, CPU fixed at 2.4 GHz, load <20%)
  • Kernel 6.6.42-v8-16k+
  • Two Waveshare 2CH CAN FD HATs, connected on SPI0 and SPI1
  • Overlays configured like this:
dtparam=spi=on
dtoverlay=spi1-3cs
dtoverlay=mcp251xfd,spi0-0,interrupt=25
dtoverlay=mcp251xfd,spi0-1,interrupt=13
dtoverlay=mcp251xfd,spi1-0,interrupt=24
dtoverlay=mcp251xfd,spi1-1,interrupt=23

What I’m doing is just logging CAN traffic (the Pi isn’t transmitting). I have a setup where one ECU sends a message on CAN1, and another ECU acts as a gateway and forwards it on CAN2. Using a Vector CANcase I can see the gateway delay is always <1ms and consistent.

On the Raspberry Pi instead:

  • With hardware timestamps (-H in candump) I initially see the correct order, but after some time the two channels drift apart and are no longer aligned.
  • With software timestamps, the relative delay between the original and the forwarded message is inconsistent (sometimes the forwarded one shows up first).

So basically the timing correlation between the two CAN interfaces is unreliable on the Pi.
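
To put numbers on this, the delay between the original and the forwarded frame can be computed from a candump log. The following is only a minimal sketch, not a tested tool. It assumes the log uses the candump log format `(timestamp) interface id#data`, that the gateway forwards frames with the same CAN ID and payload, and that no frames are lost. The interface names `can0` and `can1` and the function names are placeholders.

```python
import re
from collections import defaultdict, deque

# Matches candump log lines such as "(1695678901.123456) can0 123#DEADBEEF".
# "##" marks a CAN FD frame in this format.
LINE_RE = re.compile(r"^\((\d+\.\d+)\)\s+(\S+)\s+([0-9A-Fa-f]+)(##?)(\S*)")


def parse_line(line):
    """Return (timestamp, interface, can_id, data) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts, iface, can_id, _sep, data = m.groups()
    return float(ts), iface, can_id.upper(), data.upper()


def gateway_delays(lines, src="can0", dst="can1"):
    """Pair each frame on src with the same frame on dst and return dst_ts - src_ts.

    A negative value means the forwarded frame was timestamped before the
    original. Assumption: same ID and payload on both buses, no frame loss.
    """
    waiting_src = defaultdict(deque)  # src frames not yet seen on dst
    waiting_dst = defaultdict(deque)  # dst frames logged before their src frame
    delays = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, iface, can_id, data = parsed
        key = (can_id, data)
        if iface == src:
            if waiting_dst[key]:
                delays.append(waiting_dst[key].popleft() - ts)
            else:
                waiting_src[key].append(ts)
        elif iface == dst:
            if waiting_src[key]:
                delays.append(ts - waiting_src[key].popleft())
            else:
                waiting_dst[key].append(ts)
    return delays
```

Feeding it the lines of a log recorded on both interfaces gives one delay per forwarded frame. A slowly growing offset would point at clock drift between the controllers, while scattered positive and negative values would point at timestamping jitter.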


Extra details I checked:

  • Driver in use: mcp251xfd
  • ethtool -c can0:
Coalesce parameters for can0:
Adaptive RX: n/a  TX: n/a
stats-block-usecs: n/a
sample-interval: n/a
pkt-rate-low: n/a
pkt-rate-high: n/a

rx-usecs: n/a
rx-frames: n/a
rx-usecs-irq: 0
rx-frames-irq: 1

tx-usecs: n/a
tx-frames: n/a
tx-usecs-irq: 0
tx-frames-irq: 1

rx-usecs-low: n/a
rx-frame-low: n/a
tx-usecs-low: n/a
tx-frame-low: n/a

rx-usecs-high: n/a
rx-frame-high: n/a
tx-usecs-high: n/a
tx-frame-high: n/a

CQE mode RX: n/a  TX: n/a
  • ethtool -g can0:
Ring parameters for can0:
Pre-set maximums:
RX:             96
RX Mini:        n/a
RX Jumbo:       n/a
TX:             16
Current hardware settings:
RX:             80
RX Mini:        n/a
RX Jumbo:       n/a
TX:             8
RX Buf Len:             n/a
CQE Size:               n/a
TX Push:        off
TCP data split: n/a
  • Tried moving interrupts to different CPUs with /proc/irq/*/smp_affinity, but every time I move one IRQ, all CAN interrupts move together — I can’t split them per channel.
  • The CAN networks are quite busy but still within a reasonable busload, and the Pi CPU has plenty of headroom (<20% usage).
  • Tried to increase the priority of the process without any change.
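
To check where the CAN interrupts actually land after changing smp_affinity, the per-CPU counters in /proc/interrupts can be compared before and after. A rough sketch, assuming the usual layout of that file (a header row of CPU names, then one row per IRQ with one counter per CPU). The function name and the filter string are placeholders, and the name under which the mcp251xfd interrupts appear should be checked on the actual system.

```python
def irq_counts(text, name_filter):
    """Return {irq: {cpu: count}} for /proc/interrupts rows containing name_filter."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        if name_filter not in line:
            continue
        fields = line.split()
        irq = fields[0].rstrip(":")
        # The first len(cpus) fields after the IRQ label are the per-CPU counters.
        raw = fields[1:1 + len(cpus)]
        if len(raw) != len(cpus) or not all(f.isdigit() for f in raw):
            continue  # skip rows that do not carry a full set of counters
        result[irq] = dict(zip(cpus, (int(f) for f in raw)))
    return result
```

Calling it twice on the contents of /proc/interrupts, a few seconds apart, and comparing the counters shows which core is servicing each CAN interrupt.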

Questions

  • Is this just a limitation of how the driver timestamps frames (when they’re read over SPI instead of when the IRQ fires)?
  • Would reducing the RX ring size help improve determinism, or is interrupt coalescing the main factor?
  • Is it normal for hardware timestamps to drift between two MCP2518FD controllers?
  • Is there a known way to distribute the interrupts across different cores on Raspberry Pi 5?

Any suggestions on how to get stable timing correlation between two CAN interfaces on the Pi would be very helpful. Thanks!

filogold, Sep 25 '25 21:09