
embassy-rp critical-section-impl lock not released when flashing with probe-run/probe-rs

Open Jajcus opened this issue 2 years ago • 4 comments

My Rust program would often refuse to start with probe-run, and I would have to power-cycle the device, sometimes multiple times, to get it running. The problem became more frequent as my program grew.

Eventually I found that it was freezing when entering a critical section, e.g. in embassy_rp::init() or just info!("something").

It seems that when probe-rs resets the machine while it is inside a critical section, the lock is never cleared, which causes a deadlock on the next start.

Minimal code to trigger this:

#![no_std]
#![no_main]
#![feature(type_alias_impl_trait)]

use defmt::info;

use embassy_rp;
use embassy_executor::Spawner;

use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let _ = embassy_rp::init(Default::default());

    loop {
        info!("ping!");
    }
}

with Cargo.toml:

[package]
edition = "2021"
name = "test"
version = "0.1.0"

[dependencies]
embassy-embedded-hal = { version = "0.1.0", path = "embassy/embassy-embedded-hal", features = ["defmt"] }
embassy-executor = { version = "0.2.0", path = "embassy/embassy-executor", features = ["nightly", "arch-cortex-m", "executor-thread", "executor-interrupt", "defmt", "integrated-timers"] }
embassy-time = { version = "0.1.2", path = "embassy/embassy-time", features = ["nightly", "unstable-traits", "defmt", "defmt-timestamp-uptime"] }
#embassy-rp = { version = "0.1.0", path = "embassy/embassy-rp", features = ["defmt", "unstable-traits", "nightly", "unstable-pac", "time-driver"] }
embassy-rp = { version = "0.1.0", path = "embassy/embassy-rp", features = ["defmt", "unstable-traits", "nightly", "unstable-pac", "time-driver", "critical-section-impl"] }

cortex-m = { version = "0.7.7", features = ["inline-asm"] }
#cortex-m = { version = "0.7.7", features = ["inline-asm", "critical-section-single-core"] }
cortex-m-rt = "0.7.3"

defmt = "0.3"
defmt-rtt = "0.4.0"
panic-probe = { version = "0.3", features = ["print-defmt"] }

[profile.release]
debug = true

and .cargo/config.toml:

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
runner = "probe-run --chip RP2040 --shorten-paths"
#runner = "probe-rs run --chip RP2040"

[build]
target = "thumbv6m-none-eabi"

[env]
DEFMT_LOG = "info"

On the first run the program works (spamming 'ping!'), but on the next cargo run (or some later invocation) it won't start. Interrupting with Ctrl-C shows where it hung:

jajcus@jajco:~/tmp/test$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.06s
     Running `probe-run --chip RP2040 --shorten-paths target/thumbv6m-none-eabi/debug/test`
(HOST) INFO  flashing program (12 pages / 48.00 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
^C────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: core::sync::atomic::compiler_fence
   1: cortex_m::interrupt::disable
        at /home/jajcus/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-0.7.7/src/interrupt.rs:39:2
   2: embassy_rp::critical_section_impl::RpSpinlockCs::acquire
        at embassy/embassy-rp/src/critical_section_impl.rs:52:17
   3: <embassy_rp::critical_section_impl::RpSpinlockCs as critical_section::Impl>::acquire
        at embassy/embassy-rp/src/critical_section_impl.rs:29:6
   4: _critical_section_1_0_acquire
        at /home/jajcus/.cargo/registry/src/index.crates.io-6f17d22bba15001f/critical-section-1.1.1/src/lib.rs:280:10
   5: critical_section::acquire
        at /home/jajcus/.cargo/registry/src/index.crates.io-6f17d22bba15001f/critical-section-1.1.1/src/lib.rs:180:18
   6: critical_section::with
        at /home/jajcus/.cargo/registry/src/index.crates.io-6f17d22bba15001f/critical-section-1.1.1/src/lib.rs:223:26
   7: atomic_polyfill::polyfill::AtomicU8::fetch_update
        at /home/jajcus/.cargo/registry/src/index.crates.io-6f17d22bba15001f/atomic-polyfill-1.0.3/src/polyfill.rs:134:17
   8: <embassy_rp::timer::TimerDriver as embassy_time::driver::Driver>::allocate_alarm
        at embassy/embassy-rp/src/timer.rs:47:18
   9: _embassy_time_allocate_alarm
        at embassy/embassy-time/src/driver.rs:162:13
  10: embassy_time::driver::allocate_alarm
        at embassy/embassy-time/src/driver.rs:134:5
  11: embassy_executor::raw::SyncExecutor::new
        at embassy/embassy-executor/src/raw/mod.rs:358:38
  12: embassy_executor::raw::Executor::new
        at embassy/embassy-executor/src/raw/mod.rs:503:20
  13: embassy_executor::arch::thread::Executor::new
        at embassy/embassy-executor/src/arch/cortex_m.rs:42:24
  14: test::__cortex_m_rt_main
        at src/main.rs:12:1
  15: main
        at src/main.rs:12:1
  16: Reset
(HOST) INFO  device halted by user

The problem does not appear when I use the "critical-section-single-core" feature of cortex-m instead of "critical-section-impl" from embassy-rp (which I need for a multicore program).
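
The suspected mechanism (a lock that outlives the reset) can be illustrated with a small host-side model. This is only a sketch, not embassy-rp code: `SpinlockModel` and `boot` are hypothetical names, and the model assumes that reading the lock claims it when free and that a core-only reset leaves the lock state untouched.

```rust
/// Host-side model of a single hardware spinlock whose state is NOT
/// cleared when only the CPU core is reset (assumption of this sketch).
struct SpinlockModel {
    held: bool,
}

impl SpinlockModel {
    fn new() -> Self {
        Self { held: false }
    }

    /// Models a read of the lock register: claims the lock if it is free
    /// and reports whether the claim succeeded.
    fn try_acquire(&mut self) -> bool {
        if self.held {
            false
        } else {
            self.held = true;
            true
        }
    }

    /// Models a write to the lock register: releases the lock.
    fn release(&mut self) {
        self.held = false;
    }
}

/// Models firmware start-up entering its first critical section.
/// Real firmware would spin forever; here the spin is bounded so the
/// model terminates and reports whether the lock was ever obtained.
fn boot(lock: &mut SpinlockModel, max_spins: u32) -> bool {
    (0..max_spins).any(|_| lock.try_acquire())
}
```

In this model, a first `boot` succeeds, a second `boot` without an intervening `release` never obtains the lock (the hang seen in the backtrace), and calling `release` first makes it succeed again.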

Jajcus avatar Aug 03 '23 09:08 Jajcus

This is a known issue: the problem is that probe-rs resets only core0, not the entire chip. I've made an attempt to fix it here, but it's not trivial: https://github.com/probe-rs/probe-rs/pull/1603

Dirbaio avatar Aug 03 '23 11:08 Dirbaio

Is there any update on this? I have just begun working on a multicore program and keep hitting this issue. I always have to hold BOOT and hit reset for it to program and run correctly.

Verequies avatar Sep 26 '23 02:09 Verequies

> the problem is probe-rs resets only core0, not the entire chip

I don't think fixing probe-rs is the correct solution. If I'm reading the documentation and bootrom source code correctly, PIO is not reset by a watchdog reset, and let's not forget about boot loaders which might also be using PIO.

embassy-rp should not be trusting PIO to be in any kind of defined state when it starts execution. It should be resetting it very early during execution, or at least releasing Spinlock<31>.

phire avatar Nov 16 '23 01:11 phire

Watchdog reset can definitely reset SIO if you set the bit in PSM.WDSEL. The embassy-rp implementation does set it.
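
As a rough illustration of what "setting the bit in PSM.WDSEL" means, here is a sketch in plain register terms. It is not the embassy-rp implementation. The base address, register offset and bit positions below are taken from my reading of the RP2040 datasheet and should be checked against it, and the function names are hypothetical.

```rust
use core::ptr::{read_volatile, write_volatile};

// Assumed RP2040 register layout (verify against the datasheet).
const PSM_BASE: usize = 0x4001_0000;
const PSM_WDSEL_OFFSET: usize = 0x08;

// Assumed WDSEL bit positions for the blocks relevant to this issue.
const WDSEL_SIO: u32 = 1 << 14;
const WDSEL_PROC0: u32 = 1 << 15;
const WDSEL_PROC1: u32 = 1 << 16;

/// Mask selecting SIO and both cores to be reset by the watchdog.
pub const fn wdsel_mask_with_sio() -> u32 {
    WDSEL_SIO | WDSEL_PROC0 | WDSEL_PROC1
}

/// Adds SIO and both cores to the set of blocks a watchdog reset will reset.
///
/// # Safety
/// Must only be called on an RP2040, where the address above is the
/// PSM WDSEL register. Calling it on a host machine is undefined behaviour.
pub unsafe fn select_sio_for_watchdog_reset() {
    let wdsel = (PSM_BASE + PSM_WDSEL_OFFSET) as *mut u32;
    unsafe {
        let current = read_volatile(wdsel);
        write_volatile(wdsel, current | wdsel_mask_with_sio());
    }
}
```

Only `wdsel_mask_with_sio` is meaningful off-target; the register write is meant for the device itself, before a watchdog reset is triggered.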

Dirbaio avatar Nov 16 '23 03:11 Dirbaio

In my startup code, before doing any significant work (and especially before launching the second core), I reset all of the spinlocks. Actually, I reset most of the hardware via the reset registers. This makes live debugging much nicer. It does not strictly apply to embassy; it's a general thing when working with the RP2040, which doesn't really get reset on a debugger reset.
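
A minimal sketch of what such a spinlock reset could look like, assuming the RP2040 layout of 32 spinlock registers starting at SIO offset 0x100, where a write to a lock register releases it. The function names are hypothetical and this is not taken from embassy-rp or from my actual startup code.

```rust
use core::ptr::write_volatile;

// Assumed RP2040 SIO layout (verify against the datasheet).
const SIO_BASE: usize = 0xd000_0000;
const SPINLOCK0_OFFSET: usize = 0x100;
const SPINLOCK_COUNT: usize = 32;

/// Address of hardware spinlock register `index` (0..=31).
pub const fn spinlock_addr(index: usize) -> usize {
    SIO_BASE + SPINLOCK0_OFFSET + 4 * index
}

/// Releases every hardware spinlock, whatever state a previous run left it in.
///
/// # Safety
/// Must only be called on an RP2040, very early at start-up, before the
/// second core is launched and before any code takes a critical section.
pub unsafe fn release_all_spinlocks() {
    for index in 0..SPINLOCK_COUNT {
        // Any written value releases the lock under the assumption above.
        unsafe { write_volatile(spinlock_addr(index) as *mut u32, 1) };
    }
}
```

Index 31 is the lock mentioned earlier in the thread as Spinlock<31>, so releasing only `spinlock_addr(31)` would be the narrower variant of the same idea.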

enbyted avatar Apr 14 '24 18:04 enbyted

@Dirbaio Is this "workaround" required for teleprobe as well?

plaes avatar May 25 '24 10:05 plaes

No, teleprobe has this workaround: https://github.com/embassy-rs/teleprobe/blob/main/teleprobe/src/probe/mod.rs#L90-L127

I was hoping to port it into probe-rs with https://github.com/probe-rs/probe-rs/pull/1603, but doing it properly is hard.

Dirbaio avatar May 25 '24 11:05 Dirbaio