esp-hal icon indicating copy to clipboard operation
esp-hal copied to clipboard

esp-backtrace does not print backtrace in certain situations.

Open t-moe opened this issue 1 year ago • 1 comments

I wanted to use esp_backtrace::arch::backtrace() to obtain a backtrace in my watchdog interrupt handler, so that I can print a backtrace of where the watchdog occurred. To my surprise the backtrace was empty (even though I have force-frame-pointers enabled).

Apparently the issue was that some stack-frames around the interrupt handling code where not 16-byte aligned, which would cause the printing to be aborted (due to esp_backtrace::is_valid_ram_addr returning false).

https://github.com/esp-rs/esp-hal/blob/e033162ffd1994c46c478de2de5ec1eed3869762/esp-backtrace/src/lib.rs#L214C1-L217C6

When I removed the 16-byte alignment check, I got a correct backtrace back until main.

I'm not sure if there are also other situations where the stackframes are not 16-byte aligned, or whether that's just because I'm inside an interrupt handler. I just wanted to share my discovery, not sure if its of any help to you.

Thank you

t-moe avatar Oct 03 '24 07:10 t-moe

Thanks - that's an interesting observation and getting a backtrace is especially useful for WDT interrupts (probably not for other interrupts)

bjoernQ avatar Oct 04 '24 06:10 bjoernQ

I had a look into this and found where we screw up the stack alignment: https://github.com/esp-rs/esp-hal/blob/8a23dbe1b6e10459e4d7456f2da96b4c406e7740/esp-riscv-rt/src/lib.rs#L724

That needs to get fixed

The other thing is that after thinking about it I'd say we will never be able to get a reliable backtrace to the where the interrupt got triggered.

e.g. code like this is quite commonly produced

			42002d6a:	715d                	add	sp,sp,-80
			42002d6c:	c686                	sw	ra,76(sp)
			42002d6e:	c4a2                	sw	s0,72(sp)
			42002d70:	c2a6                	sw	s1,68(sp)
			42002d72:	c0ca                	sw	s2,64(sp)
			42002d74:	de4e                	sw	s3,60(sp)
			42002d76:	dc52                	sw	s4,56(sp)
			42002d78:	da56                	sw	s5,52(sp)
			42002d7a:	0880                	add	s0,sp,80

It's not always the case that

  • the FP is always pointing to the "right" frame
  • the FP on the stack is set
  • the RA on the stack is set

The best thing that can be done is to take MEPC and RA from the TrapFrame to see where the interrupt triggered and what called that code

See https://github.com/esp-rs/esp-hal/blob/8a23dbe1b6e10459e4d7456f2da96b4c406e7740/esp-wifi/src/timer/riscv.rs#L30-L31 https://github.com/esp-rs/esp-hal/blob/8a23dbe1b6e10459e4d7456f2da96b4c406e7740/esp-wifi/src/timer/riscv.rs#L65

(the TrrapFrame is not available for "delegated" interrupt handlers like the GPIO handler set via https://docs.esp-rs.org/esp-hal/esp-hal/0.21.1/esp32c6/esp_hal/trait.InterruptConfigurable.html#tymethod.set_interrupt_handler )

I will submit a PR for the change in esp-riscv-rt later

bjoernQ avatar Oct 29 '24 11:10 bjoernQ