Panic at runtime/rtio_mgt.rs after repeated DMA usage
Bug Report
One-Line Summary
The master Kasli sometimes panics when making repeated DMA calls to a channel on a satellite.
Issue Details
Steps to Reproduce
My system has a single satellite. This satellite has a TTLOut that is used to run this function: https://gitlab.com/duke-artiq/dax/-/blob/master/dax/modules/rtio_benchmark.py#L479. This is a benchmark we use to measure the performance of the Kasli. Sometimes the Kasli panics when we run that code.
The panic was not observed when running the same test on a TTLOut on the master, though I cannot test that exhaustively.
The panic is visible in the UART logs of the master. When the master panics, the satellite does not report anything over UART except that the link is lost.
Expected Behavior
No panic.
Actual (undesired) Behavior
Panic.
[2023-06-29 16:40:31] [ 771.652731s] INFO(runtime::session): no connection, starting idle kernel
[2023-06-29 16:40:31] [ 771.736134s] INFO(runtime::kern_hwreq): resetting RTIO
[2023-06-29 16:40:31] [ 771.790687s] INFO(runtime::session): new connection from 192.168.1.100:33904
[2023-06-29 16:40:31] panic at runtime/rtio_mgt.rs:51:29: called `Result::unwrap()` on an `Err` value: Interrupted
[2023-06-29 16:40:31] backtrace for software version 7.8173.ff97675;[removed gateware id]:
[2023-06-29 16:40:31] 0x4003d29c
[2023-06-29 16:40:31] 0x4000af10
[2023-06-29 16:40:31] 0x4000a510
[2023-06-29 16:40:31] 0x40028088
[2023-06-29 16:40:31] 0x40027eb0
[2023-06-29 16:40:31] 0x400211d8
[2023-06-29 16:40:31] 0x40024e80
[2023-06-29 16:40:31] 0x4002cc4c
[2023-06-29 16:40:31] 0x4000e7c4
[2023-06-29 16:40:31] 0x4000e7b4
[2023-06-29 16:40:31] 0x40038c94
[2023-06-29 16:40:31] 0x4000770c
[2023-06-29 16:40:31] 0x40028f40
[2023-06-29 16:40:31] 0x40028ee8
[2023-06-29 16:40:31] 0x4003c414
[2023-06-29 16:40:31] halting.
[2023-06-29 16:40:31] use `artiq_coremgmt config write -s panic_reset 1` to restart instead
Your System (omit irrelevant parts)
- Operating System: Linux (Ubuntu 22.04)
- ARTIQ version: 7.8173.ff97675
- Version of the gateware and runtime loaded in the core device: same as ARTIQ version
- Hardware involved: 2x Kasli 2.0 with master-satellite configuration, a DIO card on the satellite