OTA app project
Pull Request Overview
[2022-07-02] First prototype
[2022-07-23] Change Log
- UART version of the OTA app is implemented.
- In the current implementation, the OTA app only supports loading a new app (not erasing one).
- Applied the feedback I received and made a couple of functional improvements.
- Added the 'ota_uart.py' tool. With this tool, a new application can be loaded via UART at runtime. Please note that the size of the new application has to be smaller than or equal to the size of the OTA app because of the MPU alignment rule.
- 'ota app' keeps track of the dynamically changing start addresses of flash and SRAM memory after applications are loaded by tockloader and by ota app. This way, the update procedure of 'ota app' doesn't interfere with the memory regions occupied by the kernel and other apps.
- I left nonvolatile_storage_driver in main.rs. It is a necessary component for loading an application with 'ota app'. Also, the addresses of the kernel and app flash memory are independent of the size of 'ota app': they are fixed by the microbit_v2 platform, so we don't need to change them according to the size of 'ota app'.
[2022-08-02] Change Log
- Added a new feature that finds a flash start address satisfying the MPU rules. Now we can load 3 applications with the OTA app. For simplicity, I didn't consider the MPU subregion rules.
- Since tockloader adds 512 bytes of 01 padding after the end of an app, tockloader should not be used together with the OTA app once an app has been loaded with the OTA app. I also removed the writing of those padding bytes from the 'ota_uart.py' tool.
- Whether a flash region satisfying the MPU rules is available, and the index at which to save the new app, are passed to 'process_load_utilities'.
- After finding a start address based on the MPU rules, we check, as a fail-safe, whether the region for the new app invades regions already occupied by other apps.
- Please refer to the [2022-08-02] section of 'OTA_app_system_documnet.md'.
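The start-address search described in this change log can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `find_start_address`, `round_up`, and `overlaps` are hypothetical names, and the sketch assumes that app sizes are powers of two and that the MPU rule reduces to "start address aligned to the region size" (subregions are ignored, as in the PR).

```rust
/// Round `addr` up to the next multiple of `align` (`align` must be a power of two).
fn round_up(addr: usize, align: usize) -> usize {
    (addr + align - 1) & !(align - 1)
}

/// True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size).
fn overlaps(a_start: usize, a_size: usize, b_start: usize, b_size: usize) -> bool {
    a_start < b_start + b_size && b_start < a_start + a_size
}

/// Hypothetical sketch: find the first aligned flash address where an app of
/// `requested_size` bytes fits without touching any `(start, size)` region
/// in `existing`. Returns `None` if no such address exists.
pub fn find_start_address(
    existing: &[(usize, usize)],
    requested_size: usize,
    flash_start: usize,
    flash_end: usize,
) -> Option<usize> {
    // Assumption of this sketch: only power-of-two sizes are supported.
    if !requested_size.is_power_of_two() {
        return None;
    }
    let mut addr = round_up(flash_start, requested_size);
    while addr + requested_size <= flash_end {
        let blocker = existing
            .iter()
            .find(|&&(start, size)| overlaps(addr, requested_size, start, size));
        match blocker {
            // Nothing in the way: this candidate is aligned and free.
            None => return Some(addr),
            // Skip past the blocking app and realign the candidate.
            Some(&(start, size)) => addr = round_up(start + size, requested_size),
        }
    }
    None
}
```

The overlap test runs on every candidate, so an aligned address is only returned when it is also free, which mirrors the fail-safe check mentioned above.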
[2022-08-11] Change Log
- Added a new function to insert padding apps, so that the original process_load_advanced function can successfully load the apps loaded by the OTA app.
- I also check the CRC32 consistency of padding apps.
- Added a few commands
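For reference, a CRC32 consistency check can look like the sketch below. Which CRC variant the OTA app uses, and over which bytes it is computed, is not stated here, so this assumes the common reflected CRC-32 (polynomial 0xEDB88320, as in zlib). `crc_matches` is a hypothetical helper name.

```rust
/// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (zlib/Ethernet variant).
/// Assumption: the OTA app may use a different CRC variant.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones if the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical helper: compare the CRC of bytes read back from flash
/// with the CRC value reported by the sender.
pub fn crc_matches(flash_bytes: &[u8], expected_crc: u32) -> bool {
    crc32(flash_bytes) == expected_crc
}
```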
[2022-08-14] Change Log
- Solved two issues caused by phantom apps after tockloader erase-apps
- Fixed a bug in the check_overlap_region function
- Organized code
[2022-08-15] Change Log
- Added validation checks of the TBF base header
  - The header length isn't greater than the entire app
  - The header length is at least as large as the v2 required header (which is 16 bytes)
  - Check Base Header Checksum consistency
  - Check consistency between the requested app size and the actual app size in the TBF header
- Added a security feature.
  - Attack Scenario: A malicious ota app is installed via OTA app, and it erases (0xff) the entire flash region by using nonvolatile_storage_driver.
  - Result: Although the malicious ota app can manipulate the regions unoccupied by the existing apps, it cannot invade the regions occupied by the existing apps.
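The TBF base header checks listed above could be expressed along these lines. This is a sketch, not the kernel code in this PR. It assumes the TBF v2 base header layout as I understand it (little-endian `version: u16`, `header_size: u16`, `total_size: u32`, `flags: u32`, `checksum: u32`) and a checksum that is the XOR of all 4-byte header words with the checksum field itself excluded. `validate_base_header`, `header_checksum`, and `HeaderError` are hypothetical names.

```rust
/// Size in bytes of the required TBF v2 base header (16, per the list above).
const BASE_HEADER_SIZE: usize = 16;
/// Assumed byte offset of the checksum field within the base header.
const CHECKSUM_OFFSET: usize = 12;

#[derive(Debug, PartialEq)]
pub enum HeaderError {
    Truncated,
    HeaderSmallerThanBase,
    HeaderLargerThanApp,
    SizeMismatch,
    BadChecksum,
}

fn read_u32(bytes: &[u8], offset: usize) -> u32 {
    let mut word = [0u8; 4];
    word.copy_from_slice(&bytes[offset..offset + 4]);
    u32::from_le_bytes(word)
}

/// XOR of every 4-byte word in `header`, skipping the checksum word itself.
/// A trailing partial word is zero-padded (an assumption of this sketch).
pub fn header_checksum(header: &[u8]) -> u32 {
    let mut checksum = 0u32;
    for (index, chunk) in header.chunks(4).enumerate() {
        if index * 4 == CHECKSUM_OFFSET {
            continue;
        }
        let mut word = [0u8; 4];
        word[..chunk.len()].copy_from_slice(chunk);
        checksum ^= u32::from_le_bytes(word);
    }
    checksum
}

/// `header` holds the bytes received so far; `requested_size` is the app size
/// announced by the userspace OTA app.
pub fn validate_base_header(header: &[u8], requested_size: u32) -> Result<(), HeaderError> {
    if header.len() < BASE_HEADER_SIZE {
        return Err(HeaderError::Truncated);
    }
    let header_size = u16::from_le_bytes([header[2], header[3]]) as usize;
    let total_size = read_u32(header, 4);
    if header_size < BASE_HEADER_SIZE {
        return Err(HeaderError::HeaderSmallerThanBase);
    }
    if header_size as u32 > total_size {
        return Err(HeaderError::HeaderLargerThanApp);
    }
    if total_size != requested_size {
        return Err(HeaderError::SizeMismatch);
    }
    if header.len() < header_size {
        return Err(HeaderError::Truncated);
    }
    if header_checksum(&header[..header_size]) != read_u32(header, CHECKSUM_OFFSET) {
        return Err(HeaderError::BadChecksum);
    }
    Ok(())
}
```

All checks run before any byte is written, so a rejected header never reaches flash.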
Testing Strategy
[2022-07-02] First prototype. It is not necessary to test.
[2022-07-23] Change Log
- For demo, please refer to the guide section of 'OTA_app_system_documnet.md'.
- Test cases in terms of functionalities.
  - ota app + 3 apps (by ota app) -> tockloader erase-apps -> load ota app again -> load an app (by ota app): Success
  - ota app + 2 apps (by ota app) -> push reset button -> load 1 app: Success
  - A big size app + ota app (positioned at index 1) -> load an app (by ota app): Success
  - ota app + 1 app (by tockloader) -> load a new app (by ota app): Success
  - ota app -> load an app with crc fail -> erase the loaded app: Success
  - ota app -> load a big size app which doesn't follow the MPU alignment rule -> erase the loaded app and do not load the entry point of the app: Success
- Checked the dynamically changing flash and sram start addresses by printing out those values
[2022-08-02] Change Log
- I'm testing about 800 combinations of app bundles (256k - 512 bytes) loaded by the OTA app. As soon as I finish the test, I will upload the result [2022-08-04 Done and Pass]
[2022-08-11] Change Log
- After loading apps, check the app bundles with tockloader list --verbose. If the app bundles are loaded successfully, tockloader can read the loaded apps. [2022-08-11 Done and Pass]
- Please refer to Alignment_Test.xlsx in /doc/OTA_app
[2022-08-14] Change Log
- Tested two issues caused by phantom apps after tockloader erase-apps
[2022-08-15] Change Log
- Tested the above attack scenario
- Result: No corruption
- Tested TBF base header validation check
TODO or Help Wanted
[2022-07-02] First prototype.
[2022-07-23]
- Need to modify the Makefile and elf2tab to add a permission header to ota_app.tab for security [2022-08-02 Solved]
- Need to come up with an idea for satisfying the MPU alignment rule when loading an application with ota app [2022-08-02 Solved]
- Implement erase and update functions as future work
[2022-08-02]
- After loading apps with OTA app, when I push the reset button, the original 'load_process_advanced' cannot load the sparsely located apps. I need to find a way to parse the apps. [2022-08-11 Solved]
[2022-08-11]
- Is it worthwhile to improve the start address search based on MPU rules so that it includes subregions? (Currently, I don't consider subregion rules) => Not a priority for now.
- If we load a new app while another app with the same name already exists, I think it is reasonable to erase the old app and flash the new one. However, we can't be sure that the new app uses the same amount of sram and flash memory as the old app. How should we deal with this situation? Even if they have the same flash size, they can use different amounts of sram.
[2022-08-14]
- Add security feature to prevent a malicious OTA app from manipulating the flash region occupied by the existing apps
Documentation Updated
- [x] Updated the relevant files in /docs/OTA_app. Please refer to 'OTA_app_system_documnet.md' in the docs directory, where I summarized the OTA app design concept.
Formatting
- [] Ran make prepush.
Cool! OTA support would be great to have. It looks like a few parts of this can be merged as-is, so it's probably worth splitting this up into small PRs. That way some code can be merged and you can leave a draft PR open for the work in progress parts
I'm still working on this OTA app project. I applied the feedback I received, and after discussing with Prof. Brad Campbell, I was able to update my work. Thanks!
I updated my work.
This is looking really good, and has come a long way!
I have updated my work!
I updated OTA app project. Please refer to the change log and md file
This is getting closer and closer. I think the next thing to look at is making sure we do not have to trust the userspace OTA app to not break any existing apps or compromise the stability of the kernel. It's ok if the userspace OTA app flashes an app incorrectly somehow, but that should only negatively impact the new app, and not any of the existing apps.
This means that the kernel needs to verify that the base header of the TBF (particularly the length and version) of the new app is valid and correct. For example, if the OTA upload is between existing apps, the new app cannot be allowed to break the linked list chain such that the trailing application (the one later in flash) is no longer loaded on a board reset.
The goal is that someone should be able to write an arbitrarily malicious OTA userspace app, but no matter what that app tries to do it should be impossible for it to corrupt or disable or otherwise affect existing apps on the board.
Thank you for your feedback! I left answers and my opinion.
This is getting closer and closer. I think the next thing to look at is making sure we do not have to trust the userspace OTA app to not break any existing apps or compromise the stability of the kernel. It's ok if the userspace OTA app flashes an app incorrectly somehow, but that should only negatively impact the new app, and not any of the existing apps.
=> Yes, I agree with this feedback. Actually, there is already a function that prevents a new app loaded by the OTA app from invading the existing apps. Below code snippet checks whether or not a flash region for the new app invades the other regions occupied by the existing apps. If the new app would invade them, we try to find another region. In main.rs, there are two global arrays, PROCESSES_REGION_START_ADDRESS and PROCESSES_REGION_SIZE, in which I store the start address and size of each loaded process. The information in these two arrays is used to check whether a flash region for the new app would corrupt regions used by the existing apps.
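As an illustration only (this is not the snippet from the PR), an overlap check over two parallel arrays like the ones described above might look like this. `regions_overlap` is a hypothetical name, and treating a size of 0 as an unused slot is an assumption of the sketch.

```rust
/// Hypothetical sketch: `starts` and `sizes` stand in for
/// PROCESSES_REGION_START_ADDRESS and PROCESSES_REGION_SIZE.
/// A slot whose size is 0 is treated as unused (an assumption).
/// Returns true if [new_start, new_start + new_size) touches any used slot.
pub fn regions_overlap(
    starts: &[usize],
    sizes: &[usize],
    new_start: usize,
    new_size: usize,
) -> bool {
    let new_end = new_start + new_size;
    starts
        .iter()
        .zip(sizes.iter())
        .any(|(&start, &size)| size != 0 && new_start < start + size && start < new_end)
}
```

Half-open intervals mean that a new region starting exactly where an existing app ends is not reported as an overlap.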
This means that the kernel needs to verify that the base header of the TBF (particularly the length and version) of the new app is valid and correct.
=> Does the verification of the base header of the TBF of the new app have to be conducted before writing the binary data or after? I think it is reasonable to verify the base header before writing binary data (the new app)
For example, if the OTA upload is between existing apps, the new app cannot be allowed to break the linked list chain such that the trailing application (the one later in flash) is no longer loaded on a board reset.
=> Do I understand this example correctly? => If a new app to be loaded by the OTA app has an incorrect base header (e.g., incorrect kernel version or incorrect header length), this new app must not be written into flash. If this invalid new app is somehow written into flash anyway, we must not load it after reset. => If my understanding is right, I think I need to add verification of the base header of the new app before loading it, when receiving a load request from the OTA app.
The goal is that someone should be able to write an arbitrarily malicious OTA userspace app, but no matter what that app tries to do it should be impossible for it to corrupt or disable or otherwise affect existing apps on the board.
=> Yes, I agree with this goal. We need a couple of protection strategies. I will try to come up with an idea.
Below code snippet checks whether or not a flash region for the new app invades the other regions occupied by the existing apps
But the actual writing is done using the nonvolatile_storage capsule, correct? What happens if the OTA app uses that to write to the wrong address?
I think it is reasonable to verify the base header before writing binary data (the new app)
I agree the check should come before.
But the actual writing is done using the nonvolatile_storage capsule, correct? What happens if the OTA app uses that to write to the wrong address?
To think about this topic, we need to consider 2 cases.
- The benign OTA app writes data to the wrong address (regions occupied by the existing apps) by using the nonvolatile_storage capsule.
=> In this case, find_dynamic_start_address_of_writable_flash_advanced in process_load_utilities.rs has to guarantee that the start address and the flash region for a new app are 100% safe, i.e., that they don't invade regions occupied by the existing apps. check_overlap_region does this verification before the start address is passed to the OTA app. After receiving the start address, the OTA app writes data to flash memory.
I know that there is a possibility that find_dynamic_start_address_of_writable_flash_advanced returns a wrong start address to the OTA app. However, I assumed that the size of an app is a power of 2, and the start address (from 0x40000) is always increased by Max(requested app size, an existing app size). So, the possibility of finding a wrong address is low. doc/OTA_app/Alignment_Test.xlsx shows the result of loading all combinations of apps of various sizes. For now, find_dynamic_start_address_of_writable_flash_advanced returns the right start address.
However, if elf2tab makes app sizes more diverse (not a power of 2), that is, if elf2tab breaks my assumption, I need to add more logic to find_dynamic_start_address_of_writable_flash_advanced.
In sum, find_dynamic_start_address_of_writable_flash_advanced has to guarantee that a region for a new app is 100% safe. Then, in my opinion, this case is solved.
- The malicious OTA app writes data to the wrong address (regions occupied by the existing apps) by using the nonvolatile_storage capsule.
=> In this case, we need a protective strategy. That's the reason why I asked questions via email. I mean, the nonvolatile_storage capsule should only be visible to the benign OTA app, through some configuration or logic. I'm not sure how to implement this at the kernel level. Although I set the permission header for the benign OTA app, I think a malicious OTA app that does not set a permission header can still write data by using the nonvolatile_storage capsule. (When I tested it, an app without a permission header could use the nonvolatile_storage capsule.)
As an alternative, I could make the nonvolatile_storage capsule refer to the PROCESSES_REGION_START_ADDRESS and PROCESSES_REGION_SIZE arrays. If the requested offset belongs to the region of an existing app, the capsule has to deny the write request.
In sum, we need more protective strategies and ideas.
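The "deny the write request" alternative can be sketched against a simulated flash buffer. This is not the nonvolatile_storage capsule API: `guarded_write` and `WriteError` are hypothetical names, and `protected` plays the role of the (start, size) pairs taken from the two global arrays, expressed here as offsets into the buffer.

```rust
#[derive(Debug, PartialEq)]
pub enum WriteError {
    OutOfBounds,
    ProtectedRegion,
}

/// Hypothetical sketch: write `data` at `offset` into `flash` unless the
/// write would touch any `(start, size)` region owned by an existing app.
/// A denied write leaves `flash` untouched.
pub fn guarded_write(
    flash: &mut [u8],
    protected: &[(usize, usize)],
    offset: usize,
    data: &[u8],
) -> Result<(), WriteError> {
    let end = offset
        .checked_add(data.len())
        .ok_or(WriteError::OutOfBounds)?;
    if end > flash.len() {
        return Err(WriteError::OutOfBounds);
    }
    // Check the whole written range, not only the starting offset.
    let hits_protected = protected
        .iter()
        .any(|&(start, size)| offset < start + size && start < end);
    if hits_protected {
        return Err(WriteError::ProtectedRegion);
    }
    flash[offset..end].copy_from_slice(data);
    Ok(())
}
```

Checking the full range matters because a write that starts in free space can still run into the next app.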
- Two issues (I hope you find my email describing these issues. I elaborated on them there)
- Phantom Apps: I will add 512 bytes of 01 padding after the end of a new app after loading the new app.
- Flash Memory Leakage caused by Phantom Apps: If we repeatedly load apps from the OTA app and execute tockloader erase-apps, we face this issue. Since tockloader does not erase the entire region occupied by the existing apps, the logic that finds a start address and region for the new app skips the remnant app, which is not loaded after reset. There are a couple of options to solve this issue.
  1. tockloader erase-apps erases the entire region occupied by the existing apps. But we have to sacrifice speed.
  2. tockloader erase-apps does the same as tockloader uninstall all
  3. Add more logic to check whether or not the remnant app is actually loaded into the PROCESS global array.
For now, I applied option 3 to my work. In my opinion, most of the issues caused by phantom apps could be solved by option 1. But I'm not sure option 1 is the best way if we consider other factors.
I updated my work and solved two issues caused by phantom apps after tockloader erase-apps. Basically, I added more logic to the OTA app and process_load_utilities.rs.
Issue 1: When loading 128k -> 64k via OTA app, the 512 bytes of 01 padding delete the first page of the 128k app
Issue 2: We face the Flash Memory Leakage issue if we repeat OTA upload and tockloader erase-apps
Now the OTA app can be used together with tockloader erase-apps and is fully compatible with it.
Thank you.
As @bradjc mentioned, I updated my work
- Added validation check of TBF base header
- Prevent a malicious ota app from corrupting the existing apps
There is a threat model I want to talk about. If a malicious ota app corrupts the padding apps (especially the header information), the linked list of the existing apps will be broken after reset. At the meeting two weeks ago, I showed an idea for finding the existing apps by jumping, based on the size of the last app, from the address that cannot be parsed, instead of inserting padding apps. So, as a fail-safe, I think it would be better to add this logic to load_process_advanced of process_utilities. I would like to discuss this topic at our Thursday meeting.
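A rough sketch of the jumping idea, under stated assumptions: `scan_apps` is a hypothetical name, `parse_total_size` stands in for whatever header parsing the kernel does (returning the app's total size if a valid header is found at an address), and `min_step` is an assumed step used before any app has been found.

```rust
/// Hypothetical sketch: walk flash from `start` to `end`. When the header at
/// the current address cannot be parsed, jump ahead by the size of the last
/// app found instead of stopping, so later apps are still discovered.
pub fn scan_apps(
    start: usize,
    end: usize,
    min_step: usize,
    parse_total_size: impl Fn(usize) -> Option<usize>,
) -> Vec<(usize, usize)> {
    let mut found = Vec::new();
    if min_step == 0 {
        // A zero step would never advance past an unparseable address.
        return found;
    }
    let mut addr = start;
    let mut last_size = min_step;
    while addr < end {
        match parse_total_size(addr) {
            // A valid app that fits in flash: record it and move past it.
            Some(size) if size > 0 && addr + size <= end => {
                found.push((addr, size));
                last_size = size;
                addr += size;
            }
            // Unparseable (or implausible) header: jump by the last app size.
            _ => addr += last_size,
        }
    }
    found
}
```

Whether jumping by the last app size always lands on the next app depends on the power-of-two placement assumption discussed earlier in this thread.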
Thank you.