esp-hal
esp-hal copied to clipboard
Panic during interrupt handling in esp32c3 timer_interrupt.rs example
Hello, I've been using the esp-rs and esp32c3_hal crates to write code for an esp32c3 board. I'm trying out the timer_interrupt example here https://github.com/esp-rs/esp-hal/blob/main/esp32c3-hal/examples/timer_interrupt.rs.
Here's steps for a minimal example:
- First I
cargo generateed on the esp-rs template project, setting up a project for the esp32c3. - I set the esp32c3 dependency to a known recent revision in cargo.toml:
esp32c3-hal = { git = "https://github.com/esp-rs/esp-hal", features = ["direct-boot"], rev = "da3ec47b3" }
(This was done this way because in my actual project (not this example) the 0.2.0 version had not been released yet, and had previously been depending on main, but that got updated significantly since the last time I was working on this, so I tied the revision to the latest at the time I was looking at this again)
- In the generated project:
mkdir examples, andtouch timer_interrupts.rs - From the esp32c3 timer_interrupt example above, copy all that code and paste into the created
timer_interruptsfile in step 3. - run
cargo build --example timer_interrupts && espflash target/riscv32imc-unknown-none-elf/debug/examples/timer_interrupts --monitor --format direct-bootThis should build and then flash the esp32 with the interrupt example, then monitor the output using the espflash tool - The timer seems to go off, causing a panic:
ESP-ROM:esp32c3-api1-20210207
Build:Feb 7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
!! A panic occured in '/home/dougli1sqrd/.cargo/registry/src/github.com-1ecc6299db9ec823/compiler_builtins-0.1.79/src/int/shift.rs', at line 15, column 39
PanicInfo {
payload: Any { .. },
message: Some(
attempt to subtract with overflow,
),
location: Location {
file: "/home/dougli1sqrd/.cargo/registry/src/github.com-1ecc6299db9ec823/compiler_builtins-0.1.79/src/int/shift.rs",
line: 15,
col: 39,
},
can_unwind: true,
}
Backtrace:
0x42006b3a
0x42006b3a - core::panicking::panic
at /home/dougli1sqrd/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:48
0x420073ea
0x420073ea - compiler_builtins::int::shift::Ashl::ashl
at /home/dougli1sqrd/.cargo/registry/src/github.com-1ecc6299db9ec823/compiler_builtins-0.1.79/src/int/shift.rs:15
0x40380216
0x40380216 - esp_hal_common::interrupt::vectored::get_configured_interrupts
at /home/dougli1sqrd/.cargo/git/checkouts/esp-hal-42ec44e8c6943228/da3ec47/esp-hal-common/src/interrupt/riscv.rs:267
0x4200030c
0x4200030c - _start_trap_hal
at ??:??
0x420010f4
0x420010f4 - critical_section::release
at /home/dougli1sqrd/.cargo/registry/src/github.com-1ecc6299db9ec823/critical-section-1.1.1/src/lib.rs:197
This is happening in https://github.com/esp-rs/esp-hal/blob/main/esp-hal-common/src/interrupt/riscv.rs#L267, at the left shift.
Has anyone experienced an issue like this? Is this a bug in esp-hal, or the compiler_builtins? Do I just need to configure something easy to fix it?
Here is the actual template project with the failing example: https://github.com/rustbox/esp32-examples
Let me know if you have any ideas, thanks!
Using this toolchain in the esp32c3-hal directory: rust-toolchain.toml
[toolchain]
# The default profile includes rustc, rust-std, cargo, rust-docs, rustfmt and clippy.
# https://rust-lang.github.io/rustup/concepts/profiles.html
profile = "default"
channel = "nightly"
components = [ "rust-src" ]
targets = [ "riscv32imac-unknown-none-elf" ]
and then running from the esp32c3-hal directory:
cargo build --features direct-boot --example timer_interrupt && espflash target/riscv32imc-unknown-none-elf/debug/examples/timer_interrupt --monitor --format direct-boot
it actually works!
Build:Feb 7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
Interrupt 1
Interrupt 11
Interrupt 1
Interrupt 1
Interrupt 11
Thanks for reporting this!
I am a bit confused since the toolchain you added says targets = [ "riscv32imac-unknown-none-elf" ] but your command-line says target/riscv32imc-unknown-none-elf/debug/examples/timer_interrupt ... but it's quite early in the morning and I might miss something obvious.
Anyway, on nightlies there might be mis-compilations even for RV32
It's not really clear to me which rustc was used to create to faulty binary (i.e. what is the default toolchain or override)
I double checked it with the esp-hal repository at commit 7889a992d7efcbbba0a5c7ef679694877f30380e with rustc 1.65.0-nightly (1120c5e01 2022-09-08) without experiencing this crash - not in boot-loader mode and not in direct-boot mode
I can't seem to reproduce this either.
oh yeah, that's interesting. I don't remember why the target is riscv32imac-unknown-none-elf when the esp32c3 is clearly riscv32imc-unknown-none-elf. Switching doesn't seem to help.
I'm running rustc 1.66.0-nightly (2019147c5 2022-09-19)
One thing I did find though was I think this is somehow being caused by opt-level = "z". In the example repo I linked above I have
[profile.dev]
debug = true # Symbols are nice and they don't increase the size on Flash
opt-level = "z"
And the esp-rs crate does not. So I removed the opt-level in the dev profile and it works in the example! In fact, running it with
[profile.dev]
debug = true # Symbols are nice and they don't increase the size on Flash
opt-level = "s"
seems to work.
And again, running cargo build --example timer_interrupts && espflash target/riscv32imc-unknown-none-elf/debug/examples/timer_interrupts --monitor --format direct-boot to build and flash the board in https://github.com/rustbox/esp32-examples
Thanks!
Thanks for the update! The problem with opt-level "z" is really interesting. I'm a bit surprised to see those problems with the RV32 targets but something like that can happen on nightlies
Regarding imac vs imc - yes ESP32-C3 doesn't have atomics but we have an atomic instruction emulation layer so you can target imac with some performance penalties - however if possible I'd say that sticking to imc and using solutions like atomic-polyfill is preferable
I've been looking into this a bit with @dougli1sqrd , and the panic seems to be limited to this shift operation: https://github.com/esp-rs/esp-hal/blob/7889a992d7efcbbba0a5c7ef679694877f30380e/esp-hal-common/src/interrupt/riscv.rs#L267
I can reliably reproduce an underflow panic while shifting with this example:
#![no_std]
#![no_main]
#[allow(unused)]
use esp32c3_hal::prelude::*;
use esp_backtrace as _;
use riscv_rt::entry;
#[entry]
fn main() -> ! {
let i = 0;
esp_println::println!("1u64 << {} = {}", i, 1u64 << i);
esp_println::println!("1u128 << {} = {}", i, 1u128 << i);
loop {}
}
The panic still only occurs with opt-level="z", but it happens across a few different nightlies (including the one from September 8th mentioned above). opt-level="z" is also the only way I've found to get the compiler to emit calls to the intrinsics crate: the rest of the time it's happily using 32-bit sll (and branching, and slr) to implement the 128-bit shift.
However, in opt-level="z", the compiler is using 64-bit shifts via compiler_builtins::int::shift::Ashl::ashl (and branching, and compiler_builtins::int::shift::Lshr::lshr). The way it's breaking up the 128-bit shift into multiple 64-bit shifts end computing 1u64 << i - 64, hence the underflow when i <= 63. Unfortunately for us, I think any esp32c3 interrupt number we might see will indeed be <= 63.
So, this leads me to a pair of questions:
- This definitely seems like an upstream bug, but I'm not 100% sure where: I intend to keep looking, but if one of you immediately goes "ah, that should be filed against rustc" I'd sure be happy to learn which of the dinner guests is guilty of murder here
- It looks like the esp32 riscv interrupt handler doesn't actually need 128-bit vectors as of yet: from my reading there's only two 32-bit interrupt registers, which would fit nicely in a
u64, and indeed: when I changed the types around ininterrupt/riscv.rs
Thanks for the detailed analysis! Good job! The reason why get_status uses u128 is that on the Xtensa based chips we indeed need more than 64 bits and we want to keep things similar. Wouldn't be the end of the world if we would change it for ESP32-C3 but might be annoying.
I noticed you already opened an issue on rust-lang - thanks a lot!
Looks like it's ultimately a LLVM bug: https://github.com/llvm/llvm-project/issues/57988
I've got two ideas of varying quality:
- As mentioned, change the esp32c3 interrupt handler to operate over
u64s - Teach
esp32c3/build.rsto recognize and rejectopt-level="z"(I've got a proof of concept that I'd be happy to turn into a PR)
It's your project, though, so I'll gladly defer to you on which maintenance burden you'd rather bear, if either!
Thanks for the update!
I'd personally vote for the second option especially since this problem might also get triggered in user-code.
Ooh, good call about user code: I'll reworded the message to include some of that flavor and open a PR!
Thanks for all your help and encouragement navigating the wild & woolly world of embedded rust, I appreciate it.
I think the proof of concept looks okay. I'm not entirely convinced it should be behind a feature, and I think printing a warning and continuing is probably preferable to terminating. Based on my limited testing opt-level = "z" does work for a number of examples, so I think this case should be treated as the exception and not the norm. Happy to hear any arguments for alternatives, of course.
Oops, just saw your comment @jessebraham : unless you object, I'll move discussion over to the PR I just opened.
I guess this is resolved now - reopen if you don't agree