rp-hal
rp-hal copied to clipboard
Multicore Lockout
I was experimenting with flash writing using rp2040-flash and littlefs2. Everything was running smoothly until I wanted to try multicore with it.
As I understand it is because both cores run directly from the flash (unless you use #[link_section = ".data.ram_func"]
) and as one core litteraly unplug the flash the other instantly crash and cannot recover when the flash is plugged back.
To solve this issue, there is these functions: multicore_lockout provided by the C SDK, that can pause the other core in a safe mode until the lock is unlock (like a SpinLock but for the entire core). However, I cannot find any implementation of this in the crate, neither any pratical way to solve the problem using other ways.
I dig a bit and wanted to try implementing a SIO interrupt like they did with the lockout function. For that, on core 0 I did:
vt.init(&mut pac.PPB);
vt.register_handler(pac::Interrupt::SIO_IRQ_PROC0 as usize, test);
unsafe {
vt.activate(&mut pac.PPB);
pac::NVIC::unmask(pac::Interrupt::SIO_IRQ_PROC0);
}
Here is my function test:
#[inline(always)]
#[link_section = ".data.ram_func"]
extern "C" fn test() {
...
}
The function is triggered, however it seems like the core never come back from it and the execution stop. (I removed all flash function for my tests)
It's not obvious to me what's going wrong here. #[inline(always)]
in combination with a ram function looks strange, but I don't think that's the issue here as when it's called from an interrupt, it can't be inlined anyway.
Can you publish a small example reproducing the issue into a git repo? Perhaps I can see more looking at the full code.
Hey,
Thanks, for looking at this issue! I'm going to write a small example to isolate the behavior I'm trying to achieve so hopefully it will makes more sense to you.
I've done some test and almost figure out how to achieve what I wanted. Basically, the problem with the interrupts occurs when I was trying to use fancy calls inside the interrupt (like rebuild builtin WS2812 object for debugging) but I think it was conflicting with object inside the main function causing corruption and unwanted behavior in the main function (basically crashing it when the interrupt returned). Now that I've a cleaner interrupt with only required code inside it seems to work. The only problem left is that even if the interrupt function runs from RAM when interacting with the flash using core1, core0 seems to have a hard time recovering from the interrupt.
I'll send you the repos link with the code soon. (it will makes more sense)
Here is the example code: https://github.com/Captainfl4me/rust-2-rp/blob/master/multicore-lockout-irq/src/main.rs
While I do not yet know what exactly causes your code to fail, some ideas where I'd investigate further:
- your
write_flash
function calls several other functions. Why the function itself is in RAM, there's no guarantee these function calls don't use flash. While functions likerom_data::flash_flush_cache();
in the end call some code in ROM, there are some intermediate steps insiderp2040_hal
that may or may not be inlined into your RAM function. In general, it's difficult to guarantee that some rust code is entirely located in RAM and doesn't call any code in flash. Because of this, in https://github.com/jannic/rp2040-flash/blob/master/src/lib.rs#L174, these calls are manually ported to assembly. - The same is true for
fifo_wait_in_ram
. I'm not even sure if thelink_section
applies to the closure inside the critical section. Here, I'd put the contents of the critical section into a separate function, put that function into RAM, and manually verify that it's compiled down to some simple code that's not calling any external function.
Uh, even worse, it looks like rustc doesn't inline the sio read
/read_blocking
calls:
20000000: b5d0 push {r4, r6, r7, lr}
20000002: af02 add r7, sp, #0x8
20000004: b082 sub sp, #0x8
20000006: f000 f827 bl 0x20000058 <__Thumbv6MABSLongThunk__critical_section_1_0_acquire> @ imm = #0x4e
2000000a: 4604 mov r4, r0
2000000c: 480a ldr r0, [pc, #0x28] @ 0x20000038 <__sdata+0x38>
2000000e: 2101 movs r1, #0x1
20000010: 7001 strb r1, [r0]
20000012: a801 add r0, sp, #0x4
20000014: f000 f826 bl 0x20000064 <__Thumbv6MABSLongThunk_rp2040_hal::sio::SioFifo::read::h41c4759b5756f495> @ imm = #0x4c
20000018: 2801 cmp r0, #0x1
2000001a: d108 bne 0x2000002e <__sdata+0x2e> @ imm = #0x10
2000001c: 2969 cmp r1, #0x69
2000001e: d106 bne 0x2000002e <__sdata+0x2e> @ imm = #0xc
20000020: a801 add r0, sp, #0x4
20000022: f000 f825 bl 0x20000070 <__Thumbv6MABSLongThunk_rp2040_hal::sio::SioFifo::read_blocking::he8005b2c0e2eed67> @ imm = #0x4a
20000026: 2121 movs r1, #0x21
20000028: 0249 lsls r1, r1, #0x9
2000002a: 4288 cmp r0, r1
2000002c: d1f8 bne 0x20000020 <__sdata+0x20> @ imm = #-0x10
2000002e: 4620 mov r0, r4
20000030: f000 f824 bl 0x2000007c <__Thumbv6MABSLongThunk__critical_section_1_0_release> @ imm = #0x48
20000034: b002 add sp, #0x8
20000036: bdd0 pop {r4, r6, r7, pc}
20000038: ee 40 00 20 .word 0x200040ee
So your fifo_wait_in_ram
is actually jumping to flash, which won't work if your flash gets disabled.
(Generated with cargo objdump --release -- -d
)
Thank you so much for the insights! I did know about objdump I think it will be very useful to better understand what is actually going on.
For the write_flash function I replicated the code from rp2040_flash library but without the writing parts because I didn't think it was actually needed. I just use the Hal functions but as you say maybe that's also causing issue.
I'll take a look and let you know!
Well you're right after a close inspection of the dump file it seems like all the function calls (rom_data::
, critical_section_acquire
, critical_section_release
, read
, read_blocking
) actually live in ROM (section .text) therefore causing a jump to the ROM during interrupt.
Not sure if there is a way to force rust to inline the functions ? So I guess I'll try writing my RAM function in asm directly so that it is inline (like they did with rp2040_flash)
I refactor the code to have the following (I push the changes if you want to take a look on the repos),
#[inline(never)]
#[link_section = ".data.ram_func"]
extern "C" fn fifo_wait_in_ram() {
// critical_section::with(|_| {
if fifo_is_read_ready() {
if fifo_read() != LOCKOUT_CORE {
return;
}
loop {
if fifo_is_read_ready() && fifo_read() == UNLOCKOUT_CORE {
break;
}
}
}
// Drain the FIFO before returning
while fifo_is_read_ready() {
let _ = fifo_read();
}
// });
}
#[link_section = ".data.ram_func"]
pub fn fifo_is_read_ready() -> bool {
let sio = unsafe { &(*pac::SIO::ptr()) };
sio.fifo_st.read().vld().bit_is_set()
}
#[link_section = ".data.ram_func"]
pub fn fifo_read() -> u32 {
let sio = unsafe { &(*pac::SIO::ptr()) };
sio.fifo_rd.read().bits()
}
According to the objdump, there is no more call to functions living in ROM. (I also update the core1 to use rp2040_flash so that I don't have to call rom_data
). However, even if, without the flash logic (if I remove the flash_range_program
) the interrupt is working, when I add the flash_range_program
the program freeze after triggering the interrupt (not sure when, if it is during or after)
Okay my bad it was just a problem with the way I was passing the buffer to the flash_range_program
function. Now, everything seems to work!!
Code here
Glad to hear that it works now, and thanks for posting the working code!