espflash
[esp32c6] espflash write-bin converts the ELF image to a `.bin` which has the DROM/IROM segment containing `esp_app_desc_t` at a wrong position
The code that converts an ELF to a .bin file needs some black magic to make sure that the segment containing the esp_app_desc_t structure is placed as the first segment in the resulting .bin file.
However, the correct placement of this struct fails on the esp32c6 (and I suspect other newer MCUs as well). More info here: https://github.com/esp-rs/esp-idf-sys/issues/365#issuecomment-2624123925
The above was always necessary so that one can easily locate the position of the esp_app_desc_t structure during OTA update and abort the OTA update "in-flight" if the image description describes an older (or in other ways inappropriate) image for flashing.
The problem is however now much more severe with the recent breed of new ESP IDF bootloaders (ESP-IDF 5.2.3 and ESP-IDF 5.3.2) which themselves do expect the esp_app_desc_t to be at the beginning of the firmware image (offset 32 if I'm not mistaken) or else they read bogus values and often refuse to boot the firmware image.
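To make the "offset 32" expectation concrete, here is a minimal sketch of a check one could run over a generated .bin. It assumes the usual ESP app image layout (a 24-byte image header followed by an 8-byte segment header, so the first segment's data starts at offset 32) and the `esp_app_desc_t` magic word `0xABCD5432` from ESP-IDF. The function name `app_desc_at_start` is made up for illustration and is not part of espflash.

```rust
/// Size in bytes of the image header that opens an ESP app image (assumed layout).
const IMAGE_HEADER_LEN: usize = 24;
/// Size in bytes of each segment header (load address + data length).
const SEGMENT_HEADER_LEN: usize = 8;
/// Magic byte at the very start of an ESP app image.
const IMAGE_MAGIC: u8 = 0xE9;
/// Magic word that opens `esp_app_desc_t` (ESP_APP_DESC_MAGIC_WORD in ESP-IDF).
const APP_DESC_MAGIC: u32 = 0xABCD_5432;

/// Hypothetical helper: returns true if the image carries the `esp_app_desc_t`
/// magic word right after the image header and the first segment header.
pub fn app_desc_at_start(image: &[u8]) -> bool {
    // 24 + 8 = 32, the offset mentioned above.
    let offset = IMAGE_HEADER_LEN + SEGMENT_HEADER_LEN;
    if image.first() != Some(&IMAGE_MAGIC) {
        return false;
    }
    match image.get(offset..offset + 4) {
        // The magic word is stored little-endian.
        Some(b) => u32::from_le_bytes([b[0], b[1], b[2], b[3]]) == APP_DESC_MAGIC,
        // Image too short to contain the descriptor at all.
        None => false,
    }
}
```

Feeding it the bytes of a `save-image` output would return `false` whenever some other segment (or padding) ended up first in the file.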
Possible reasons why the esp_app_desc_t is no longer placed at the beginning of the binary image are explained in detail in the above esp-idf-sys bug. They have a lot to do with the fact that esp32c6 and newer MCUs no longer distinguish between "DROM" and "IROM" and, moreover, the "DROM" segment (if there is still such a thing after all) has an address that comes after the IROM segment (unlike all earlier chips, where exactly the opposite is true).
I suspect proper placement of esp_app_desc_t was only working by pure luck in espflash anyway, simply because xmas-elf probably returns the ELF segments in address order, so the "DROM" segment naturally ended up first in the resulting .bin for earlier chips.
FYI, esptool is suffering exactly the same bug, but ESP-IDF C code is apparently unaffected, as the ESP IDF native C build apparently uses "something else" to convert the ELF to .bin...
FYI guys, the problem and the proposed solutions are described here.
I'd like to clarify the exact issue, because write-bin does not convert the elf in any way. Is the issue that espflash save-image converts the image incorrectly, that espflash flash flashes to the device incorrectly, or both?
> I'd like to clarify the exact issue, because `write-bin` does not convert the elf in any way.
Sure I actually meant save-image, not write-bin.
> Is the issue that `espflash save-image` converts the image incorrectly, that `espflash flash` flashes to the device incorrectly, or both?
Both. In that the code used to convert the elf to ".bin" (segments) is also used by espflash flash.
In esptool the problem was (supposedly) fixed by https://github.com/espressif/esptool/commit/f4fabc5de45942f96a952ef084aed6d26e093438 - but that looks a bit more complicated than "grab the misplaced segment and swap it first". I can look into porting that change to espflash, but it would be helpful if someone could tell me it indeed fixes the issue in esptool.
Not exactly. In the end it turned out that there is actually no bug in esptool - however - one has to really pass the correct MMU page size for the concrete chip when invoking it. Because otherwise it always uses 64KB for MMU page size, which leads to mis-aligned segments on the esp32c6 (which by default uses 32KB in the esp-idf ld script, but this is changeable with a config option).
So perhaps the first thing to figure out is:
- Is there a way to specify the MMU page size to `espflash`? If not, perhaps there should be, so that the user can specify e.g. 32KB for esp32c6 (and others).
- Then use that MMU page size instead of the (possibly hardcoded) 64KB and see if everything is aligned properly (because, to repeat myself, the only problem in `esptool` was that I was not explicitly passing the correct MMU page size and that's why it was misaligning; there was no other bug).
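To illustrate why the assumed MMU page size matters, here is a rough sketch of the alignment rule as I understand it: a flash-mapped segment's data has to land at a file offset that is congruent to its virtual address modulo the MMU page size. The function name and the example address are hypothetical, and the sketch ignores the extra header that a real padding segment would need.

```rust
/// Size in bytes of a segment header (load address + data length).
const SEG_HEADER_BYTES: u32 = 8;

/// Hypothetical helper: number of padding bytes to emit before a segment header
/// written at `file_pos`, so that the segment *data* ends up at a file offset
/// congruent to `vaddr` modulo `mmu_page_size`.
/// Simplification: a real padding segment carries its own 8-byte header,
/// which is not accounted for here.
pub fn padding_before_segment(file_pos: u32, vaddr: u32, mmu_page_size: u32) -> u32 {
    assert!(mmu_page_size.is_power_of_two());
    // Where the segment data would start with no padding at all.
    let data_pos = file_pos + SEG_HEADER_BYTES;
    let want = vaddr % mmu_page_size;
    let have = data_pos % mmu_page_size;
    (want + mmu_page_size - have) % mmu_page_size
}
```

With a made-up segment address of `0x4200_8020` placed right after the 24-byte image header, `padding_before_segment(24, 0x4200_8020, 0x8000)` is 0, so the segment data stays at offset 32. Assuming 64KB instead, `padding_before_segment(24, 0x4200_8020, 0x1_0000)` is `0x8000`, which pushes the segment 32KB further into the file. That is the kind of misplacement described in this thread.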
And maybe only after all of the above, figure out where to get the "proper" MMU page size from. The PR you've linked does a bit of black magic by reading the MMU page size from the esp_app_desc_t structure itself, as there is "no better way". I'm not 100% sure whether that's the best possible approach, but my point is that we first have to verify whether espflash does a proper job once it has the correct MMU page size for the app image.
Or to put it another way: the linked PR only extracts the MMU page size that was used for that concrete app image (or rather, in its ld script) when the user did not specify anything on the command line w.r.t. MMU page size. Once esptool knows the correct MMU page size, it actually works fine. I'm not sure espflash is in the same shape yet, i.e. whether, if there were a way to pass it the correct MMU page size (which there isn't), its internals would use that and align properly.
Thanks, that's a good summary of a lengthy mess of confusing information. To reproduce this, is it enough for me to create a hello world for the c6 with a recent enough esp-idf, or do I need a specific elf hidden somewhere?
That should be good enough, yes. Just double-check that the esp-idf (if you use pure C hello world) contains the esp_app_desc_t thing so that you can have an easy way of detecting if the generated .bin file is mis-aligned when looking at the HEX-bytes of the .bin file in the Matrix.
@bugadani feel free to ask for details. The best case is to always pass the mmu_page_size parameter, but this was not possible for esptool as it is a breaking change that would cause issues for ESP-IDF too. Since ESP-IDF 5.4 the parameter is present in esp_app_desc_t, so it can be determined from there. If it is not there, it can also be 'guessed' from the alignment of the esp_app_desc_t address, as the linker script should ensure that the segment comes right after the image and segment headers. This is not ideal, as an address can coincidentally be aligned to both 64K and 32K, for example, but alignment to 64K when the correct value is 32K should not be an issue for most applications. Also, a warning is printed in this case. The best option would be to require mmu_page_size for the chips with a variable MMU page size, but I believe this is the most I can do on the esptool side without enforcing the parameter.
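For reference, the "guess from alignment" fallback described above could look roughly like the sketch below. It is not the actual esptool code: the candidate list (8K to 64K) and the function name are assumptions, and preferring the largest matching size is just my reading of the comment above.

```rust
/// Offset of the first segment's data in the image: image header (24) + segment header (8).
const FIRST_SEGMENT_DATA_OFFSET: u32 = 32;
/// Assumed candidate MMU page sizes, largest first (64K, 32K, 16K, 8K).
const CANDIDATE_PAGE_SIZES: [u32; 4] = [0x1_0000, 0x8000, 0x4000, 0x2000];

/// Hypothetical helper: guess the MMU page size from the address of the segment
/// holding `esp_app_desc_t`. The linker script is expected to place that segment
/// right after the headers, so `addr - 32` should be a multiple of the page size.
/// Returns the largest candidate that fits, or `None` if none does.
pub fn guess_mmu_page_size(app_desc_addr: u32) -> Option<u32> {
    let base = app_desc_addr.checked_sub(FIRST_SEGMENT_DATA_OFFSET)?;
    CANDIDATE_PAGE_SIZES
        .iter()
        .copied()
        .find(|&size| base % size == 0)
}
```

An address like `0x4200_0020` fits every candidate, so the guess is 64K even if the image was linked for 32K. That is the coincidental-alignment case mentioned above. `0x4200_8020` can only be 32K or smaller, so the guess is 32K.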
@bugadani and @jessebraham - I found this issue only hours before you merged the fix, but thank you for fixing it. I'd spent days trying to figure out why the firmware I'd built and flashed to an esp32c6 board was boot looping with messages like:
I (91) boot: End of partition table
I (95) boot: No factory image, trying OTA 0
E (98) esp_image: Failed to fetch app description header!
E (104) boot: OTA app partition slot 0 is not bootable
E (108) esp_image: image at 0x400000 has invalid magic byte (nothing flashed here?)
E (116) boot: OTA app partition slot 1 is not bootable
E (121) boot: No bootable app partitions in the partition table
After finding this issue and building cargo-espflash from the fix commit, I can flash the firmware and the board boots correctly. Thanks again!