RIOT
RIOT copied to clipboard
OTA: tracking issue
Tracking OTA progress
Tracking issue for the different parts required to enable over the air updates.
Overview
We are aiming at a simple OTA prototype in a first phase, which can be extended in later phases. The blueprint for the first prototype is the implementation we had working at the SUIT Hackathon in June 2018 (see slide 3 of this recap). This combines a simplistic bootloader, 2 firmware slots and metadata that follows the SUIT specification. A basic update module, embarked in each image, manages firmware transfer (via CoAP), verification (complying with SUIT) and storage in Flash within the non-running image slot.
Work breakdown
Current state of development
@cladmi is addressing the tooling for the bootloader and metadata @kYc0o is addressing the bootloader @OYTIS is working on support for RIOT, including OTA, for KEA128LEDLIGHTRD board (https://github.com/RIOT-OS/RIOT/pull/9655)
See the GitHub Software Updates project for more details.
System diagrams
Both of the diagrams reflect the content of the working OTA update branch. This contains working examples for both a metadata-based update, and a SUIT manifest-based update. The areas not relevant to the example addressed in each section are greyed out.
Non-SUIT metadata based update mechanism
A description of the metadata is given here.
SUIT metadata based update mechanism
The process being followed is to merge the functionality required for OTA updates piecewise, instead of as a large monolithic PR.
Files for diagrams above
The diagrams were done using draw.io (https://www.draw.io/). Contact @danpetry for the project files.
Linker offset supports depends on ~https://github.com/RIOT-OS/RIOT/pull/9451~ to allow testing.
State of development
Please update this list as required so that we all know what we're up to
Development branches and repos
Here's a list of active branches and repos where people are developing. Please add your own with a short description if you can get around to it. https://github.com/kYc0o/RIOT/tree/hackathon_suit_2018 - Paco dev branch https://github.com/kYc0o/riot-ota-server/tree/master/otaserver - OTA server (Paco fork) https://github.com/aabadie/ota-server - OTA server (original) https://github.com/danpetry/RIOT/tree/ota/sandbox - Dan dev branch. Works with samr21-xpro along with a gen_and_flash.sh script which builds/flashes/generates manifest for use with a CoAP client/server for firmware distribution https://github.com/suit-wg/SUIT-repository-manager - CoAP client/server for firmware distribution https://github.com/suit-wg - suit working group. https://github.com/hannestschofenig/suit-manifest-generator - Manifest generator. fork of ARMmbed/suit-manifest-generator with a couple of additions.
Related issues not in the diagram above
Either because they are dependencies to the issues above, or some other reason. Basically here are some issues you might want to be aware of, and help along if you can. https://github.com/RIOT-OS/RIOT/pull/8902 - Original OTA PR ~https://github.com/RIOT-OS/RIOT/pull/9351~ - ~PR to allow partial ROM flashing for cortex m cpus~ EDIT (cladmi): to allow generating firmwares in part of the ROM for cortex-m
Current work being undertaken
Dan - working on RAPstore project on integration, so architecture can be informed by situational concerns Anton (OYTIS) - implementing periph/flasher module for a board (? which one)
For binflash I would also have a dependency of FLASHFILE
: https://github.com/RIOT-OS/RIOT/pull/8838
And I am thinking, why do we even need to do binflash
. Flashing a binary means having both a file and an address. And this could be done by generating an ihex
file instead.
And I am thinking, why do we even need to do binflash. Flashing a binary means having both a file and an address. And this could be done by generating an ihex file instead.
As far as my early tests went, it was very complicated to manage ihex files especially when you want to flash bootloader + image (slot 1 or 2). Then build an hex file with such information was very challenging at that time. To speed up things, and make something more reliable I used bin files flashed at the needed address.
Now we have all the information we need to know where to flash, so I'd say there's no problem using and flashing binary files while a bootloader takes the beginning of the flash.
For the bootloader, the mimimal set of required metadata is this:
/**
* @brief Structure to store firmware metadata
* @{
*/
typedef struct {
uint32_t magic_number; /**< metadata magic_number (always "RIOT") */
uint32_t version; /**< Integer representing firmware version */
uint32_t start_addr; /**< Start address in flash */
uint32_t chksum; /**< checksum of metadata */
} firmware_metadata_t;
/** @} */
- magic_number: To indicate that a firmware metadata field is located here.
- version: Version of this firmware. The bootloader will boot the firmware with the highest version number
- start_addr: Start address of the firmware. This way the bootloader can easily work with variable metadata sizes.
- chksum: Checksum of the metadata to check for a valid metadata (not firmware).
Documentation todo
- Dependencies:
- CPU peripherals required
- metadata format and bootloader interaction
- Metadata fields
- CPU initialization (happens twice)
Notes about flashbin
:
edbg
edbg
is already flashing binary files.
An offset from the start address can be given using -o|--offset
the PR currently uses this as an absolute address which is wrong in general. So flashing a binary from start address works.
Some work is only required to disable erasing everything and supporting flashing from an offset
of the start address. Which is only needed when flashing sub parts of the rom/invalidating a slot.
edbg.inc.mk: allow flashing with an offset in rom without erasing all ROM #9788
Openocd
Openocd supports flashing from binary as is, except that the flashing address must be given.
It uses offset
which is an offset from what is in the firmware, so offset from the link address for elf files/hex files, offset from 0
for binary files.
http://openocd.org/doc/html/Flash-Commands.html flash write_image
.
I plan to use the ROM_START_ADDR
in the openocd script to make flashing a binary work by default without external configuration. And use OFFSET
as an offset from this.
EDIT: The address cannot easily extracted from the build system, openocd.inc.mk
is included before cpu/.../Makefile.include
and having export ROM_START_ADDR
would give it a default empty value breaking the ROM_START_ADDR ?=
lines.
I queried this address from openocd.
openocd.sh: allow flashing binary files without configuration #9787
This would give a common behavior between boards using edbg and openocd.
Minutes from this week's OTA meeting (15.08.2018) available here
Which option is preferred to add comments/questions on the minutes? Should I comment via this issue, create a new issue, or ...?
I'd say: comment here?
I agree that it makes sense to use the existing Block1 functionality for now. Conceptually, some host sends firmware to the (RIOT) device, although I guess use of PUT/POST or GET depends on the higher-level application protocol. At any rate, the device is on the receiving end.
Also, "roadblock" is not a word I like to see associated with my name! I'm reviewing the RFC and Block2 PR (#8932), and plan to comment in the next day or so on moving forward.
@kb2ma actually Block2 has advantages in terms of security characteristics (since it is pull-based). If #8932 could be merged fast into master (before other CoAP reworks), it would improve the initial OTA solution we are going for, no doubt. That said, we plan OTA enhancements/reworks anyways, after an initial (basic) OTA mechanism is merged in the master branch. Hence, thanks a lot for your proposal to let us know about #8932, indeed ;-)
Both edbg
and openocd
binflash have been merged. The branch will need to be rebased with the changes. @kYc0o we can spend some time doing this and testing this afternoon if you want.
It currently still requires setting two different variables to handle both openocd and edbg to flash a binary file but will be addressed by https://github.com/RIOT-OS/RIOT/pull/8838.
Another dependency I remember as I am rebasing the branch, is process dependencies for CPU and move stm32_common
periph_flashpage
dependencies to Makefile.dep
I could easily provide a test for this one even if the test does not need to be merged.
Finally I got to test the current state with a rebased branch with most recent changes in master related to OTA. The most updated branch is at [1].
Two problems:
- Compile for slot 2 doesn't seem to work with the current verification for slot sizes:
Specified firmware size does not fit in ROM
. However, by bypassing the assert in cortexm.ld it compiles and actually is linked correctly, but: - For edbg this line is not working
EDBG_ARGS += $(addprefix --offset ,$(IMAGE_OFFSET))
:
/Users/facosta/git/RIOT-OS/RIOT/dist/tools/edbg/edbg --offset $((0x1000 --offset + --offset 0x1f800)) -t atmel_cm0p -b -v -p -f /Users/facosta/git/RIOT-OS/RIOT/examples/suit_updater/bin/samr21-xpro/suit_updater-slot2.signed.bin
bash: 0x1000 --offset + --offset 0x1f800: syntax error in expression (error token is "--offset + --offset 0x1f800")
make: *** [/Users/facosta/git/RIOT-OS/RIOT/makefiles/riotboot.mk:112: riotboot/flash-slot2] Error 1
[1] https://github.com/kYc0o/RIOT/tree/wip/rebase/ota_work_branch
For edbg
I did not think about the value with spaces so 'addprefix' does not work in that case.
Two solutions:
- Replace
IMAGE_OFFSET=$((0x1000+0x1f800))
without spaces - Replace the EDBG_ARGS line with
$(if $(IMAGE_OFFSET),--offset $(IMAGE_OFFSET)))
The first one can be done directly on our side to make it work, and upstream the proper fix to RIOT after.
I found the issue for the offset on slot 2, the assert was right. In makefiles/riotboot.mk
I wrote
FW_ROM_LEN=$(RIOTBOOT_FW_SLOT_SIZE)
when the slot size includes both the metadata and the firmware, so should be
FW_ROM_LEN=$(RIOTBOOT_FW_SLOT_SIZE - $(RIOTBOOT_HDR_LEN))
And then the ldscript changes can be reverted.
We are not consistent with the names yet. If slot
means metadata + firmware, the generated files should be adapted.
I also have some issues when building the bootloader on my laptop as git am
, which applies patches to packages, tries to use HOME
to find git configuration and extract email and author name but we do not pass it down for building.
For me it is an issue that 'git am` tries to use local configuration to do the commit. I want to fix this for some time but did not have real visible problems to show.
Just as a side note I made while playing around with @kYc0o branch, if the periph_gpio, periph_uart and uart_stdio are disabled (I had to comment out some initialization functions), the bootloader takes 1428B on the samr21-xpro. Original was 2920B. I didn't test whether it was still functioning, but I don't see a reason why the bootloader would need those peripherals. Chopping the flash requirement in half seems to be the more beneficial trade off here.
On a practical side, this requires making the initialization of the gpio and (stdio_)uart optional. While not impossible, this would add another todo on the already large list. I just want this to be noted down somewhere so I don't forget about it.
Would people possibly want the stdio uart for debugging?
this would add another todo on the already large list.
I vote for getting the bootloader merged without this, and then optimizing more for size later. For this reason.
Would people possibly want the stdio uart for debugging?
Probably, but if the bootloader is functioning, it should not be necessary. Furthermore, if your device is somewhere in the wall, the UART might not be accessible and kinda useless to include in the config.
I vote for getting the bootloader merged without this, and then optimizing more for size later. For this reason.
That's exactly what I would also prefer here, just want to have it somewhere written down :smile:
Additional note, with the current implementation using two slots, saving space in the bootloader is only useful if you save two pages, as both slots have the same size.
For the samr21-xpro
with a page size of 256
you save ROM, with the iotlab-m3
the page size is 2048
so there would be nothing gained here, with the current implementation.
Proper fix for edbg issue with spaces in IMAGE_OFFSET
: https://github.com/RIOT-OS/RIOT/pull/9956
Proper fix for edbg issue with spaces in IMAGE_OFFSET: #9956
Thanks @cladmi ! It is merged. Do you mind to rebase again? It should be easier now.
@kYc0o I will do it and remove non needed commits, I will still keep the last fixups for the moment.
Rebased, I checked the rebased diff against the previous one.
Now I can do make riotboot/flash-bootloader
, make riotboot/flash-slot1
and make riotboot/flash-slot2
for samr21-xpro
without modifications.
I also included changes to make it work on iotlab-m3
using openocd.sh
. IMAGE_OFFSET
and IMAGE_FILE
should be exported to be visible in the script.
And fixed for iotlab-m3
flash-slot2
there are issues with $(( ))
and export
.
I also think that flash-slot1
and flash-slot2
should NOT flash the bootloader.
Cpu_init idempotency test
We wanted to implement a test which re-ran cpu_init(), as a unit test which would have helped to isolate bugs for people wanting to port the bootloader to a new board, and would have avoided them attempting this if the test failed. This was intended as a necessary but not sufficient test for the bootloader to be able to load working firmware.
It was found that for the SAM R21, this gave a false negative as such a test. This is because the newlib initialization is run after cpu_init(), and here the state is set correctly of the generic clock register that clocks the stdio USART. This means that no [SUCCESS]
assert printout is received by the testing machine if only cpu_init() (which sets these registers into a default, incorrect state) is run.
In general, the startup sequences are, currently, quite heterogeneous. This means that the startup sequence can't really be broken down into testable units. Although such tests are important to help people to isolate bugs more quickly - we have already had issues in this area - here it would require refactoring of the startup code for all different boards to a common format, which would be a considerable amount of work. This work is also considered non-blocking to the implementation of a bootloader.
I've raised an issue (https://github.com/RIOT-OS/RIOT/issues/9964) for this work, which can continue in parallel with the initial implementation of a bootloader.