Teensy 4.1: extra flash, optional external RAM not addressable

Open mciantyre opened this issue 3 years ago • 12 comments

The teensy4-bsp supports both Teensy 4.0 and 4.1 boards. We achieve this with a single linker script. However, the common support means that we are not using the Teensy 4.1's

  • larger flash
  • (optional) external RAM
  • (optional) extra flash

Users who need more than ~2MB flash for their Teensy 4.1 programs, or who want to use the pads for extra RAM and flash, may find that today's BSP doesn't support these features.

This issue tracks support for extra storage on the Teensy 4.1.

mciantyre avatar Nov 30 '20 23:11 mciantyre

Note that this issue doesn't affect usage of the SD card on the Teensy 4.1. I believe using the SD card will need a uSDHC driver.

mciantyre avatar Jul 11 '21 02:07 mciantyre

How would one go about enabling this? I'm actually running into hard memory limits whilst working on the Teensy, as I had assumed to have at least 1mb of RAM available for the Heap implementation.

tyalie avatar Dec 12 '21 10:12 tyalie

If you're interested in hardware modifications that increase RAM, the Teensy 4.1 supports external RAM. Here are the recommended parts and installation instructions. Otherwise, the on-chip RAM (OCRAM) is the same for both Teensy 4 models.

I had assumed to have at least 1mb of RAM available for the Heap implementation.

Sorry, but I'm not sure we can provide 1MB of RAM solely for heap. The official Teensy 4 runtime provides up to 512KB of OCRAM for the heap. See the Teensy 4.1's memory map, here. #110 proposes something similar for this implementation.

mciantyre avatar Dec 14 '21 22:12 mciantyre

Ah, thank you. Now I understand the title better. Sorry for posting in the wrong thread. I assumed it was different from the Teensy 4.0, as the flash has increased immensely in comparison.

tyalie avatar Dec 14 '21 22:12 tyalie

@mciantyre Should it suffice to replace this line

RuntimeBuilder::from_flexspi(Family::Imxrt1060, 1984 * 1024)

with

RuntimeBuilder::from_flexspi(Family::Imxrt1060, 16384 * 1024)

for the Teensy MicroMod?

(EDIT: I suppose the FlexSPI Configuration Block in teensy4-fcb would also need to be modified to reflect the different flash_size?)

Also, since we're also talking about RAM here...

I got my FlexIO+DMA LCD driver working, and implemented a driver for Slint, and it works:

(image: teensy-micromod-slint)

But that's just running the small demo here: https://github.com/slint-ui/slint-mcu-rust-template/blob/225ab463511411ffb4b26f71b77e2215717e8667/ui/appwindow.slint

I wanted to try running their printer UI demo, but I run into linking issues:

error: linking with `rust-lld` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/cstrahan/.rustup/toolchains/nightly-2023-02-13-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/cstrahan/.deno/bin:/run/user/1000/fnm_multishells/15290_1676178469019/bin:/home/cstrahan/.fnm:/home/cstrahan/.local/share/pnpm:/home/cstrahan/.asdf/shims:/home/cstrahan/.asdf/bin:/home/cstrahan/.local/bin:/usr/local/bin:/run/user/1000/fnm_multishells/6622_1676176825985/bin:/home/cstrahan/.cargo/bin:/usr/local/sbin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/cstrahan/go/bin:/usr/local/go/bin:/home/cstrahan/.fzf/bin" VSLANG="1033" "rust-lld" "-flavor" "gnu" "/tmp/rustcWpV0mK/symbols.o" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/deps/hello_teensy-b04e3cee815147c1.hello_teensy.09139a8b-cgu.0.rcgu.o" "--as-needed" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/deps" "-L" "/home/cstrahan/src/hello-teensy/target/release/deps" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/build/cortex-m-090e74d85d092cc7/out" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/build/cortex-m-rt-def1062c5ff082b2/out" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/build/defmt-475418a5914e0621/out" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/build/imxrt-ral-72f39907aa255519/out" "-L" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/build/teensy4-bsp-2c6edf65dc997284/out" "-L" "/home/cstrahan/.rustup/toolchains/nightly-2023-02-13-x86_64-unknown-linux-gnu/lib/rustlib/thumbv7em-none-eabihf/lib" "-Bstatic" "/tmp/rustcWpV0mK/libcortex_m-51776dd45413cd5e.rlib" "/home/cstrahan/.rustup/toolchains/nightly-2023-02-13-x86_64-unknown-linux-gnu/lib/rustlib/thumbv7em-none-eabihf/lib/libcompiler_builtins-cb44e1eeaeda502e.rlib" "-Bdynamic" "--eh-frame-hdr" "-znoexecstack" "-L" 
"/home/cstrahan/.rustup/toolchains/nightly-2023-02-13-x86_64-unknown-linux-gnu/lib/rustlib/thumbv7em-none-eabihf/lib" "-o" "/home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/deps/hello_teensy-b04e3cee815147c1" "--gc-sections" "-Tt4link.x" "-Tdefmt.x"
  = note: rust-lld: warning: section type mismatch for .uninit.defmt-rtt.BUFFER
          >>> /home/cstrahan/src/hello-teensy/target/thumbv7em-none-eabihf/release/deps/hello_teensy-b04e3cee815147c1.hello_teensy.09139a8b-cgu.0.rcgu.o:(.uninit.defmt-rtt.BUFFER): SHT_PROGBITS
          >>> output section .uninit: SHT_NOBITS
          
          rust-lld: warning: section type mismatch for .got
          >>> <internal>:(.got): SHT_PROGBITS
          >>> output section .got: SHT_NOBITS
          
          rust-lld: warning: section type mismatch for .got.plt
          >>> <internal>:(.got.plt): SHT_PROGBITS
          >>> output section .got: SHT_NOBITS
          
          rust-lld: warning: section type mismatch for .got
          >>> <internal>:(.got): SHT_PROGBITS
          >>> output section .got: SHT_NOBITS
          
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 220 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7408 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7484 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7488 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7492 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7494 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7500 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7502 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7776 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7860 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7874 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7932 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 7940 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 8188 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 8216 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 9828 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 9900 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 9952 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 10004 bytes
          rust-lld: error: section '.text' will not fit in region 'ITCM': overflowed by 10044 bytes
          rust-lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
          

warning: `hello-teensy` (bin "hello-teensy") generated 38 warnings
error: could not compile `hello-teensy` due to previous error; 38 warnings emitted

Tried adding this to Cargo.toml:

[profile.release]
opt-level = "z" 
lto = true

but still no luck. Any advice?

cstrahan avatar Feb 13 '23 21:02 cstrahan

They run the demo on a STM32H735IGK6U MCU (564kB of SRAM) and a Raspberry Pi Pico (264kB SRAM), so it seems like there should be some way to pull this off on the TMM. Certainly they aren't doing something like keeping code on flash instead of SRAM :thinking:

I figured I'd do something like cargo nm --release -- --print-size --size-sort | grep ' \(t\|T\) ' to see the worst offenders, but I can't do that if the whole thing fails to link :sweat_smile:.

cstrahan avatar Feb 13 '23 22:02 cstrahan

diff --git a/build.rs b/build.rs
index 201ff96..3f74126 100644
--- a/build.rs
+++ b/build.rs
@@ -13,7 +13,7 @@ fn main() {
         .stack(Memory::Dtcm)
         .stack_size(16 * 1024)
         .vectors(Memory::Dtcm)
-        .text(Memory::Itcm)
+        .text(Memory::Flash)
         .data(Memory::Dtcm)
         .bss(Memory::Dtcm)
         .uninit(Memory::Ocram)

This lets me compile and link without error, but flashing the device doesn't give any indication that anything is happening (no RTT output, and nothing on screen). Dunno if I'm either missing something there, or if the RT1062 needs to be configured differently to run from flash, or something else.

cstrahan avatar Feb 13 '23 23:02 cstrahan

Well done 🎉 very cool to see a UI framework running on these MCUs!

Yup, I'd expect changes to the flash size in the RuntimeBuilder::from_flexspi call, and also the FCB. However, I'm not expecting this to be necessary until we see a linker warning indicate we're running out of FLASH. (The previous linker error indicates we're out of ITCM; increasing the flash size won't fix that.)


From the unmodified runtime:

https://github.com/mciantyre/teensy4-rs/blob/6f3833471a1f214e0415c7cde9dac82c1e360607/build.rs#L6-L10

Every FlexRAM bank is 32 KiB of some kind of RAM. If you add a bank to ITCM, you'll get an additional 32 KiB for instructions. But, the chip only has 16 banks. So if you add a bank to ITCM, you'll need to take a bank away from DTCM.

I'm hoping that there's a balance of FlexRAM banks where those "section '.text' will not fit in region 'ITCM'" errors go away.
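
To make the bank trade-off concrete, here is a small host-side sketch of the arithmetic in plain Rust, with no imxrt-rt dependency. The `Banks` struct and the helper names are hypothetical, and the starting 8/8 split is an illustrative assumption rather than the BSP's actual default. Note that the linker output above was cut off by the error limit, so 10044 bytes is only a lower bound on the real overflow.

```rust
/// Size of one FlexRAM bank, per the comment above.
const BANK_BYTES: u32 = 32 * 1024;
/// Total number of FlexRAM banks on the chip.
const TOTAL_BANKS: u32 = 16;

/// Hypothetical stand-in for a FlexRAM bank allocation.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Banks {
    ocram: u32,
    itcm: u32,
    dtcm: u32,
}

impl Banks {
    /// An allocation is only meaningful if it uses exactly 16 banks.
    fn is_valid(&self) -> bool {
        self.ocram + self.itcm + self.dtcm == TOTAL_BANKS
    }

    /// Move `n` banks from DTCM to ITCM, if DTCM has that many to give.
    fn dtcm_to_itcm(self, n: u32) -> Option<Banks> {
        if n > self.dtcm {
            return None;
        }
        Some(Banks {
            itcm: self.itcm + n,
            dtcm: self.dtcm - n,
            ..self
        })
    }
}

/// Smallest number of extra banks that covers an overflow of `overflow_bytes`.
fn banks_needed(overflow_bytes: u32) -> u32 {
    (overflow_bytes + BANK_BYTES - 1) / BANK_BYTES
}

fn main() {
    // Assumed starting split, for illustration only.
    let start = Banks { ocram: 0, itcm: 8, dtcm: 8 };
    // Largest overflow visible in the truncated linker output.
    let extra = banks_needed(10_044);
    let after = start.dtcm_to_itcm(extra).expect("not enough DTCM banks");
    assert!(after.is_valid());
    println!("extra ITCM banks: {extra}");
    println!(
        "ITCM: {} KiB, DTCM: {} KiB",
        after.itcm * 32,
        after.dtcm * 32
    );
}
```

Whatever the real overflow turns out to be, every bank given to ITCM is 32 KiB less for whatever lives in DTCM (stack, data, bss in the configs discussed here).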


An extreme alternate approach: How about putting everything into OCRAM? We can express that with the RuntimeBuilder. This should be the way to give ourselves 1024 KiB of contiguous RAM.

RuntimeBuilder::from_flexspi(Family::Imxrt1060, 1984 * 1024)
    .flexram_banks(FlexRamBanks {
        ocram: 16,
        itcm: 0,
        dtcm: 0,
    })
    .heap(Memory::Ocram)
    .heap_size(16 * 1024)
    .stack(Memory::Ocram)
    .stack_size(16 * 1024)
    .vectors(Memory::Ocram)
    .text(Memory::Ocram)
    .data(Memory::Ocram)
    .rodata(Memory::Ocram)
    .bss(Memory::Ocram)
    .uninit(Memory::Ocram)
    .linker_script_name("t4link.x")
    .build()
    .unwrap();

Naively, I'd expect your XIP configuration, the one with .text(Memory::Flash), to work. But I haven't played around enough with XIP to know its proper setup and limitations. (If I remember correctly, whenever I tried XIP on my 1010EVK, I'd eventually fault with undefined instructions.)
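
The 1024 KiB figure for the all-OCRAM layout can be checked with a quick host-side calculation. This is a sketch with hypothetical names; it assumes the 512 KiB of dedicated on-chip RAM mentioned earlier in the thread, plus whatever FlexRAM banks are assigned to OCRAM.

```rust
/// Size of one FlexRAM bank in KiB.
const BANK_KIB: u32 = 32;
/// Total number of FlexRAM banks on the chip.
const FLEXRAM_BANKS: u32 = 16;
/// Dedicated (non-FlexRAM) on-chip RAM in KiB; assumed from the
/// 512KB OCRAM figure quoted earlier in the thread.
const DEDICATED_OCRAM_KIB: u32 = 512;

/// Total OCRAM in KiB when `flexram_ocram_banks` banks are assigned to OCRAM.
fn total_ocram_kib(flexram_ocram_banks: u32) -> u32 {
    assert!(flexram_ocram_banks <= FLEXRAM_BANKS);
    DEDICATED_OCRAM_KIB + flexram_ocram_banks * BANK_KIB
}

fn main() {
    // All 16 banks as OCRAM, as in the configuration above.
    println!("all banks as OCRAM: {} KiB", total_ocram_kib(16));
    // No banks as OCRAM leaves only the dedicated region.
    println!("no banks as OCRAM:  {} KiB", total_ocram_kib(0));
}
```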

mciantyre avatar Feb 13 '23 23:02 mciantyre

Thanks @mciantyre for your help thus far!

I tried again with the XIP config, but using the simple demo UI, and that did work.

So I decided to put aside the big printer UI demo and see if I could reproduce my problems by just duplicating the text and button field on that UI; sure enough, past a certain number of elements (~10 text boxes or buttons) the program crashes again.

Attached a debugger and found that when it does fail, it happens right when making a function call. Execution jumps straight to the HardFault handler. So my hunch is something like I can't call a function if the address is too high.

My OOM handler isn't called, so it's not that (and besides, this is really early on in the initialization of the program -- not much has been or will be allocated at that point).

In case it was a problem with XIP, I also tried with this config (only slightly modified from your suggestion):

#[cfg(feature = "rt")]
fn main() {
    use imxrt_rt::{Family, FlexRamBanks, Memory, RuntimeBuilder};

    RuntimeBuilder::from_flexspi(Family::Imxrt1060, 1984 * 1024)
        .flexram_banks(FlexRamBanks {
            ocram: 16,
            itcm: 0,
            dtcm: 0,
        })
        .heap(Memory::Ocram)
        .heap_size(16 * 1024)
        .stack(Memory::Ocram)
        .stack_size(16 * 1024)
        .vectors(Memory::Ocram)
        .text(Memory::Ocram)
        .rodata(Memory::Flash) // <--- rodata still won't fit, so put it in flash
        .data(Memory::Ocram)
        .bss(Memory::Ocram)
        .uninit(Memory::Ocram)
        .linker_script_name("t4link.x")
        .build()
        .unwrap();
}

I feel like it's sooo close!

If you have any other suggestions, please do let me know. Thanks!

cstrahan avatar Feb 14 '23 03:02 cstrahan

Realized it could be a stack overflow, so I tried doubling up on stack (.stack_size(32 * 1024)) and what wasn't working is working now using the last config I mentioned!

So my next order of business is figuring out how I can get a heads-up when the stack overflows, because that's super not fun to debug.

cstrahan avatar Feb 14 '23 03:02 cstrahan

Awesome! Good call putting read-only data into flash. I've generally had better luck fetching data over FlexSPI than fetching instructions over FlexSPI.


One stack overflow detection strategy that comes to mind is to use the MPU. A MemManage fault could be that heads-up. But for the MPU to be effective, I think we'd need to define all of the accessible memory regions.

I'd been thinking about exposing all the memory regions through imxrt-rt for this purpose. The crate already has APIs for the heap start and end; it could also make available the endpoints for stack, text, data, uninit, etc. This might let users implement their own MPU policy. We could also provide a default policy in teensy4-bsp or imxrt-rt.

(My suggestion ignores a potential problem of "what does a helpful MemManage / HardFault response look like when we're out of stack?")
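
As a rough illustration of the MPU idea: one common approach on ARMv7-M (an assumption here, not something this thread settles on) is a small no-access guard region at the lowest address of the stack, so that an overflow faults instead of silently corrupting memory. The host-side sketch below only computes the region attribute (RASR) value for such a guard. The function names and the example address are hypothetical, it does not touch hardware, and it leaves open which other regions would need defining.

```rust
/// ARMv7-M MPU regions must be a power of two in size, at least 32 bytes.
/// Returns the SIZE field, where the region covers 2^(SIZE + 1) bytes.
fn region_size_field(size_bytes: u32) -> Option<u32> {
    if size_bytes < 32 || !size_bytes.is_power_of_two() {
        return None;
    }
    // SIZE = log2(size) - 1.
    Some(size_bytes.trailing_zeros() - 1)
}

/// Build a RASR value for an enabled, no-access, execute-never region.
/// Returns `None` if the size is invalid or `base` is not size-aligned.
fn guard_rasr(base: u32, size_bytes: u32) -> Option<u32> {
    let size = region_size_field(size_bytes)?;
    // The region base address must be aligned to the region size.
    if base % size_bytes != 0 {
        return None;
    }
    const ENABLE: u32 = 1 << 0; // region enable bit
    const XN: u32 = 1 << 28; // execute never
    const AP_NO_ACCESS: u32 = 0b000 << 24; // no access at any privilege
    Some(XN | AP_NO_ACCESS | (size << 1) | ENABLE)
}

fn main() {
    // Hypothetical lowest stack address, for illustration only.
    let stack_bottom: u32 = 0x2020_0000;
    match guard_rasr(stack_bottom, 32) {
        Some(rasr) => println!("guard RASR = {rasr:#010x}"),
        None => println!("invalid guard region"),
    }
}
```

On a real target, the base address would come from the runtime's stack symbols, which is where exposing the region endpoints through imxrt-rt, as suggested above, would help.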

mciantyre avatar Feb 14 '23 21:02 mciantyre

@mciantyre I'll have to read up on the MPU -- thanks for the suggestion!

Also, since you might be interested, a video of me running their printer demo UI: https://twitter.com/charlesstrahan/status/1630026224356474881

Just recently wrapped up the last pieces I needed to get everything working :). Hoping to soon tease apart the 8080 bus, Slint backend, Goodix touchscreen and RA8876 (and ILI9486) controller code into separate crates and open-source everything. After that, I'm hoping to design a PCB to facilitate direct connection to this display (and maybe a couple others), so that someone wanting to write fancy (and responsive) UIs for embedded projects can hit the ground running from day 1.

cstrahan avatar Feb 27 '23 03:02 cstrahan