dasharo-issues
OS block booting verification
Brief summary
Test a mechanism that prevents the operating system from booting until a reboot occurs after the firmware update process. The primary objective is to ensure that the system remains secure and stable after firmware updates are applied.
Additional context
As a test, `CapsuleApp.efi` should be called with:
- non-FMP capsule (e.g. UX),
- FMP capsule signed with wrong key (like the one used for https://github.com/Dasharo/dasharo-issues/issues/804),
- invalid FMP capsule (wrong GUID, size or version, if anti-rollback is enabled).
In all of those cases, platform must reboot without booting to OS/Shell.
Initial tests can be performed on QEMU, but it must have capsule options enabled in the config. So, basically what has to be done in this task is:
1. Clone coreboot from https://github.com/Dasharo/coreboot/tree/uefi-capsules
2. Add missing options to the Q35 config (`CONFIG_DRIVERS_EFI_FW_INFO=y` and `CONFIG_DRIVERS_EFI_UPDATE_CAPSULES=y`).
3. Build the image.
4. Make a copy of that file because it will be updated by the capsule, and you will probably need to restore the previous version.
5. Bump `CONFIG_LOCALVERSION` in config to something higher and build again.
6. Create a capsule from the newer version. You may need to set LowestSupportedVersion as "0x00000000", just in case.
7. Create a drive image formatted as FAT32 and copy the capsule created in 6. together with `CapsuleApp.efi` (it will be located in `coreboot/payloads/external/edk2/workspace/Build/DasharoPayloadPkgX64/RELEASE_COREBOOT/X64/MdeModulePkg/Application/CapsuleApp/CapsuleApp/OUTPUT/CapsuleApp.efi` after building firmware; it doesn’t matter if you use the version from the old or new `LOCALVERSION` as they are identical).
8. Start QEMU with the copy of the old firmware and the drive with the capsule mounted (not `virtio`, `ide` should be used), then enter UEFI shell.
9. Confirm the version with `smbiosview -t 0` (field `BIOS Version:`) - should contain old `CONFIG_LOCALVERSION`.
10. Run `CapsuleApp.efi dasharo.cap`. It should automatically cause the reboot, after which you should be presented with bootsplash and progress bar; wait for it to finish. After that the platform will reboot itself.
11. Enter UEFI shell and confirm the version with `smbiosview -t 0` (field `BIOS Version:`) - should contain new `CONFIG_LOCALVERSION`.
That was the expected flow for valid capsule. Please start with that and report issues (if any) before trying to test invalid ones.
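The version checks in steps 9 and 11 could be automated roughly as below. This is only a sketch: it assumes the `smbiosview -t 0` output has already been captured as text and that the field appears on a line starting with `BIOS Version:` as described above. The helper names and the sample output (including the version strings) are hypothetical.

```python
import re
from typing import Optional


def bios_version(smbiosview_output: str) -> Optional[str]:
    """Return the text after 'BIOS Version:' or None if the field is missing."""
    match = re.search(
        r"^\s*BIOS Version:\s*(.+?)\s*$", smbiosview_output, re.MULTILINE
    )
    return match.group(1) if match else None


def version_matches(smbiosview_output: str, localversion: str) -> bool:
    """True if the reported BIOS version contains the given LOCALVERSION."""
    version = bios_version(smbiosview_output)
    return version is not None and localversion in version


# Hypothetical, shortened output; the real command prints more fields.
SAMPLE = """\
Type=0, Handle=0x0
Vendor: coreboot
BIOS Version: Dasharo (coreboot+UEFI) v0.2.0
"""
```

Before the update, `version_matches(output, old_localversion)` would be expected to hold; after a valid capsule, the same check with the new value.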
Invalid keys
From the previous state:
1. Copy original old firmware back to working copy (i.e. “downgrade”) - easier and faster than rebuilding.
2. Create new capsule as previously, but specify keys from 2. in JSON file. You may want to have separate JSON and capsule files to be able to quickly re-run the tests.
3. Copy new capsule to drive mounted in QEMU (again, it may be worth using a different name so that multiple capsules can coexist on one drive image).
4. Repeat steps 8-10 from valid capsule flow. There should be no update (no progress bar), but the platform should reboot twice anyway. I’m not sure how to reliably test how many reboots there are, at least not on release builds. In case of QEMU it should be clearly noticeable due to this issue, but for MSI we will have to rely on `CapsuleApp.efi -S`, as described in this comment.
5. Enter UEFI shell and confirm the version with `smbiosview -t 0` (field `BIOS Version:`) - should contain old `CONFIG_LOCALVERSION`.
Invalid GUID
Same as invalid keys, except when building capsules modify JSON:
1. Use valid keys (original `BaseTools/Source/Python/Pkcs7Sign/Test*.pem` files).
2. Change GUID to anything else.
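Deriving the invalid-GUID JSON from the valid one could look roughly like this. It is a sketch that assumes the capsule JSON has a top-level `Payloads` list whose entries carry a `Guid` field. The GUID, file name and version values below are placeholders, not the real ones, and the key-related fields are left out because they stay whatever the valid JSON already uses.

```python
import copy
import json
import uuid


def with_different_guid(config: dict) -> dict:
    """Return a copy of the capsule config with every payload GUID replaced."""
    variant = copy.deepcopy(config)  # leave the valid config untouched
    for payload in variant["Payloads"]:
        original = uuid.UUID(payload["Guid"])
        new_guid = uuid.uuid4()
        while new_guid == original:  # practically never loops, kept for safety
            new_guid = uuid.uuid4()
        payload["Guid"] = str(new_guid)
    return variant


# Placeholder config; field names are an assumption about the JSON layout.
VALID = {
    "Payloads": [
        {
            "Guid": "00112233-4455-6677-8899-aabbccddeeff",
            "FwVersion": "0x00000002",
            "LowestSupportedVersion": "0x00000000",
            "Payload": "coreboot.rom",
        }
    ]
}

# Text that could be written to a separate JSON file for the invalid-GUID capsule.
INVALID_GUID_JSON = json.dumps(with_different_guid(VALID), indent=4)
```

Keeping the valid and invalid JSON files separate matches the earlier suggestion to have separate JSON and capsule files for quick re-runs.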
Invalid version (anti-rollback)
We won’t be able to test it until we have at least 2 releases with support for capsule updates (preferably at least one real release and one dev release for testing). This will also be easier after capsule creation is more automatic than currently, which will probably happen around https://github.com/Dasharo/dasharo-issues/issues/807.
This test would make use of `DRIVERS_EFI_MAIN_FW_VERSION`, `DRIVERS_EFI_MAIN_FW_LSV` in config and `LowestSupportedVersion` in JSON file. At this point I have no idea how to test it, short of building a few pseudo-releases with different version numbers, and signing them with release keys. If such a pseudo-release somehow ended up public, it would be hard to track whether it is a test release or a real one, so maybe this isn’t a good idea. There is also a question of RC binaries: they would have to have a version different than the final one, yet numerically lower than a final release... Anyway, this isn’t something you have to worry about now, but at some point in the future this will have to be added.
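For reference, the expected outcome of an anti-rollback test can be modelled with a simple comparison. This sketch assumes the usual rule that a capsule is rejected when its version is numerically lower than the lowest supported version of the installed firmware. How `DRIVERS_EFI_MAIN_FW_VERSION` and `DRIVERS_EFI_MAIN_FW_LSV` map onto these numbers is not confirmed here, and the function names are hypothetical.

```python
def parse_version(text: str) -> int:
    """Parse a hex version string such as "0x00000002" into an integer."""
    return int(text, 16)


def update_allowed(capsule_version: str, installed_lsv: str) -> bool:
    """Assumed rule: reject capsules older than the installed lowest supported version."""
    return parse_version(capsule_version) >= parse_version(installed_lsv)
```

With `LowestSupportedVersion` set to "0x00000000", as suggested for the valid flow, every capsule version passes this check, which is why that value is convenient while anti-rollback is not under test.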
> Clone coreboot from https://github.com/Dasharo/coreboot/tree/uefi-capsules Add missing options to the Q35 config (CONFIG_DRIVERS_EFI_FW_INFO=y and CONFIG_DRIVERS_EFI_UPDATE_CAPSULES=y). Build the image. Make a copy of that file because it will be updated by the capsule, and you probably will need to restore previous version.
Do we plan to enable this driver in the qemu build in the long run? Maybe we can already do that, so qemu with capsule update can be built in CI on this branch?
Or add another build config to do so?
@JanPrusinowski could you please indicate how many, and what test cases you plan to prepare in the form of a checklist to track progress? Please remember to gather logs from testing newly made test cases and indicate which version of the capsule update you were testing.
@BeataZdunczyk
- [x] TC1: Update DUT with a capsule and check if the update was successful by checking if BIOS Version has changed.
- [x] TC2: Try to update DUT with capsule with invalid keys. Verify if the DUT reboots twice. Verify that the BIOS Version wasn't changed.
- [x] TC3: Try to update DUT with invalid GUID. Same as TC2 but with wrong GUID and valid keys.
- [ ] TC4: Try to update DUT with invalid fw version. This test might not work as @krystian-hebel stated that we might need at least 3 releases to be able to check if it works.
TC1 is somewhat ready. I'm working on automating creation of capsules needed for testing.
@krystian-hebel @SergiiDmytruk, did you consider UEFI SCT Capsule conformance?
https://github.com/search?q=repo%3Atianocore%2Fedk2-test+CAPSULE&type=code
@JanPrusinowski I have updated your comment to be in the form of a checklist. @krystian-hebel will propose here switching the order of the tests so that we only need to flash the platform once.
@JanPrusinowski I think we should move TC1 to the very end, maybe even leave a few unused numbers for future tests (e.g. make it TC50). That way we would save a few flashing cycles because the final image would be identical to the ROM and flashrom updates only the parts that are changed. This would both save time and reduce flash wear.
TC4 may be split into a few different cases (e.g. valid downgrade, attempt to downgrade beyond minimal version, transition between RC and normal releases - both ways). But still, for now we don't have to worry about it as we won't be able to test it for the next few releases anyway.
> I'm working on automating creation of capsules needed for testing.
Why? Have you two agreed on this, @JanPrusinowski and @krystian-hebel?
I am wondering if we really need this at this point.
We do have the "Automate the creation and execution of the UX capsule" task in phase 5, so I guess tests that verify the automation should be prepared then.
@BeataZdunczyk we agreed on it. UX will be handled very differently than what's needed here. We need some automation here, otherwise tests would require both valid and invalid capsules to be passed externally. This would possibly give false positive results if the tester didn't build the capsules properly.
> @krystian-hebel @SergiiDmytruk, did you consider UEFI SCT Capsule conformance?
I know I didn't. It looks like test of UEFI API itself, not sure it makes sense to introduce the use of UEFI SCT specifically for capsules.
I wanted to introduce UEFI SCT for a long time, but if you say it doesn't make sense, I'm good with that. I know that ProjectMu uses UEFI SCT to validate their edk2 fork, so I thought it could be beneficial in our case, but maybe I'm wrong here, and it doesn't add any value.
I suppose it can be beneficial to add it at some point for validation of the fork, but I don't expect it to catch anything for capsules (because we barely touched implementation of those calls) which is why it seems not worth the effort in this case (unless it's really easy to integrate).
I have prepared tests for:
- TC1: Update DUT with a capsule and check if the update was successful by checking if BIOS Version has changed.
- TC2: Try to update DUT with capsule with invalid keys. Verify if the DUT reboots twice. Verify that the BIOS Version wasn't changed.
- TC3: Try to update DUT with invalid GUID. Same as TC2 but with wrong GUID and valid keys.

I also ran them both on QEMU and MSI. More details can be found at: https://github.com/Dasharo/open-source-firmware-validation/pull/457
To run tests on qemu prepare a valid capsule file and then use this capsule file to generate invalid capsules required in tests by running the script:
./scripts/capsules/capsule_update_tests.sh dasharo.cap
then start the tests:
robot -v snipeit:no -L TRACE -v rte_ip:127.0.0.1 -v config:qemu -v capsule_fw_file:dasharo.cap dasharo-stability/capsule-update.robot
To start tests on MSI, please edit the FW to enable Console Serial Redirection before preparing the capsule; without it, a successful flash of the DUT will prevent the tests from working correctly. Use the guide: https://github.com/Dasharo/open-source-firmware-validation/blob/develop/docs/troubleshooting.md
Before starting the test, run the capsule prepare script (same as on QEMU).
robot -v snipeit:no -L TRACE -v rte_ip:192.168.10.188 -v config:msi-pro-z690-a-ddr5 -v sonoff_ip:192.168.10.69 -v pikvm_ip:192.168.10.45 -v device_ip:192.168.10.39 -v fw_file:./msi_ms7d25_v1.1.2_ddr5.rom -v capsule_fw_file:./msi_ms7d25_v1.1.3_ddr5.cap dasharo-stability/capsule-update.robot
Tests on MSI fail for now as FW doesn't support capsule update yet. @SergiiDmytruk @krystian-hebel I could test if it works if I got a modified FW for MSI - however on QEMU everything works as it should and the tests themselves start on MSI, so everything should work once the modified FW is available.
Tests that were added:
- TC1: Update DUT with a capsule and check if the update was successful by checking if BIOS Version has changed.
- TC2: Try to update DUT with capsule with invalid keys. Verify that the BIOS Version wasn't changed.
- TC3: Try to update DUT with invalid GUID. Same as TC2 but with wrong GUID and valid keys.

PR available: https://github.com/Dasharo/open-source-firmware-validation/pull/457
MSI tests were conducted on: MSI PRO Z690-A DDR5. Logs can be found in my previous comment.
Both on QEMU and MSI I was able to verify that the DUT won't reset if the capsule is built without the `--capflag InitiateReset` flag.
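Whether a given capsule file was built with that flag can be checked from its header before running a test. The sketch below relies on the `EFI_CAPSULE_HEADER` layout and flag values from the UEFI specification (GUID, `HeaderSize`, `Flags`, `CapsuleImageSize`, all little-endian). The helper names are hypothetical and not part of OSFV.

```python
import struct

# Flag values as defined by the UEFI specification.
CAPSULE_FLAGS_PERSIST_ACROSS_RESET = 0x00010000
CAPSULE_FLAGS_POPULATE_SYSTEM_TABLE = 0x00020000
CAPSULE_FLAGS_INITIATE_RESET = 0x00040000

# EFI_CAPSULE_HEADER: CapsuleGuid (16 bytes), HeaderSize, Flags, CapsuleImageSize.
CAPSULE_HEADER = struct.Struct("<16sIII")


def capsule_flags(data: bytes) -> int:
    """Return the Flags field from the start of a capsule image."""
    _guid, _header_size, flags, _image_size = CAPSULE_HEADER.unpack_from(data)
    return flags


def initiates_reset(data: bytes) -> bool:
    """True if the capsule header has the InitiateReset flag set."""
    return bool(capsule_flags(data) & CAPSULE_FLAGS_INITIATE_RESET)
```

Reading the first 28 bytes of the `.cap` file and passing them to `initiates_reset` is enough; a test could use this to assert that the capsule under test really matches the scenario it claims to cover.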
However, on MSI capsule update is not supported yet in FW so tests can't be completed:
> MSI tests were conducted on:

@JanPrusinowski This is an internal link. Please just add information about the platform.
I have updated tests and the script generating capsules required by tests. Tests now work on MSI. log-msi-pass.zip
First somewhat successful test. The FS mapping under which the Ubuntu installation shows up in the UEFI Shell is not consistent, so it may need to be determined dynamically. Locally I added a loop iterating over all the FSs.
- CUP001.001 Failed because `Capsule Status` was `Security Violation` and not `Not Ready`.
- CUP002.001 Passed
- CUP050.001 Failed because `No match found for 'to boot directly' in 3 minutes`, the platform booted into Ubuntu before `to boot directly` was found. Serial redirection was turned off. Maybe I did something wrong when setting it in the FW using dcu. I will set it in the binary again and run the tests another time.
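The loop over filesystems mentioned above could be sketched like this. It is not the code used locally. The `exists` callback stands for whatever the test uses to check a path in the UEFI Shell, and the Ubuntu loader path in the usage note is only an example.

```python
from typing import Callable, Optional


def find_filesystem(
    exists: Callable[[str], bool], path: str, max_fs: int = 10
) -> Optional[str]:
    """Return the first FSn: mapping that contains `path`, or None."""
    for index in range(max_fs):
        # Build e.g. FS0:\<path> and let the caller-supplied probe check it.
        candidate = f"FS{index}:\\{path}"
        if exists(candidate):
            return f"FS{index}:"
    return None
```

For example, `find_filesystem(probe, "EFI\\ubuntu\\shimx64.efi")` would return `FS2:` if that is where the probe first reports the file, so the test no longer depends on a fixed mapping.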
@philipandag This is strange, because the first two tests should not have changed the FW at all... And capsule update should not turn off the Serial redirection if it was turned on previously. Are you running each test separately or are you running the whole suite? Maybe you have flashed FW in between running tests?
I am running the whole suite. From what I recall in the documentation on capsule updates it said that the setup menu options will be restored to defaults after an update. The serial redirection was probably turned off as a result of performing a capsule update created from a fw image which had them disabled. I forgot to replace it with the changed version. I am running the suite again now.
The suite failed again on CUP050.001 because the platform took over 3 minutes to boot after the update. CUP001.001 fails exactly as before. I have modified the test a bit so that it would continue even if the platform needs a lot of time to boot after an update/flash, but then it started to freeze on the boot logo and was not showing any text on the screen.
After reflashing manually, the CUP050 test passed once I had turned on the serial redirection manually. I don’t know why it is overwritten. I have replaced the `./build/coreboot.rom` file which is used for the update with a modified one, regenerated the capsule and replaced it in the `open-source-firmware-validation` repo where I run the tests.
(venv) fgolas•review-capsule-user-guide/open-source-firmware-validation/dcu(main)» ./dcuc variable coreboot.rom --get SerialRedirection [17:35:20]
Enabled
(venv) fgolas•review-capsule-user-guide/open-source-firmware-validation/dcu(main)»
(It is a copy of the file. I copied it to the dcu directory to check the variable.) CUP050.zip
CUP001.001 Still fails in the same exact way as before in my case.
@philipandag Could you provide logs from the failed tests?
@philipandag, you don't mention these things, so I want to check that they were taken into account:
- Re-generation of test capsules via `scripts/capsules/capsule_update_tests.sh`
- Uncommenting of `... AND ... Flash Firmware If Not QEMU` in your clone of OSFV repo.
> The suite failed again on CUP050.001 because the platform took over 3 minutes to boot after the update.

I think it just doesn't boot after flashing. `Power On` keyword doesn't seem to take Sonoff into account and I saw that it was off all the time after the test has flashed firmware and waited for the platform to boot. `Power Cycle On` seems to be necessary after flashing. I don't deal with OSFV often enough to know such things for sure but it's used in some other places after flashing and I'm testing that now.
Update: adding `Power Cycle On` seems to help.
- I did re-generate the capsules
- I did not uncomment anything
> I think it just doesn't boot after flashing.

It eventually does. Just after a long time. Adding `Wait Until Keyword Succeeds` before `Enter Boot Menu Tianocore` helps with that by effectively multiplying the timeout by an integer. If using Power Cycle On speeds things up then that's great, let's use it instead.
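The effect of wrapping a keyword in `Wait Until Keyword Succeeds` can be illustrated with a plain retry loop. This is a sketch of the idea only, not the Robot Framework implementation. The action is assumed to raise an exception when its own timeout expires, so N attempts give roughly N times the original timeout.

```python
import time
from typing import Callable


def wait_until_succeeds(
    action: Callable[[], None], attempts: int, interval: float = 0.0
) -> int:
    """Run `action` until it stops raising; return the attempt that succeeded."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            action()  # e.g. a step that waits up to 3 minutes for the boot menu
            return attempt
        except Exception as error:  # each failure consumes one full timeout
            last_error = error
            time.sleep(interval)
    raise TimeoutError(f"no success after {attempts} attempts") from last_error
```

With a 3 minute keyword timeout and 3 attempts, the platform gets up to about 9 minutes to reach the boot menu, which hides a slow boot rather than explaining it.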
> From what I recall in the documentation on capsule updates it said that the setup menu options will be restored to defaults after an update. The serial redirection was probably turned off as a result of performing a capsule update created from a fw image which had them disabled.
These settings will be preserved for final implementation, but aren't yet. This will be done as part of https://github.com/Dasharo/dasharo-issues/issues/809.
> I think it just doesn't boot after flashing. `Power On` keyword doesn't seem to take Sonoff into account and I saw that it was off all the time after the test has flashed firmware and waited for the platform to boot. `Power Cycle On` seems to be necessary after flashing. I don't deal with OSFV often enough to know such things for sure but it's used in some other places after flashing and I'm testing that now.
>
> Update: adding `Power Cycle On` seems to help.
>
> If using `Power Cycle On` speeds things up then that's great, let's use it instead.
Let's not. There should be no need to cut the power after an update, and we should test it in a way that would be the closest to what we expect the end-users will do.
> Let's not. There should be no need to cut the power after an update, and we should test it in a way that would be the closest to what we expect the end-users will do.
This isn't part of the test, but part of the setup to flash initial ROM which turns Sonoff off but nothing turns it on according to Robot's log.
Using the binaries Sergii sent I get a fail in CUP001 & CUP002 and a PASS in CUP999 sergii_binaries.zip
> Using the binaries Sergii sent I get a fail in CUP001 & CUP002 and a PASS in CUP999

That was kinda expected. The result was inverted compared to your binaries because tests at that point didn't expect FUM on failures and it looks like you didn't build with updated EDK2 (the old commit was likely still checked out).