dasharo-issues icon indicating copy to clipboard operation
dasharo-issues copied to clipboard

Preserve user data during firmware updates

Open BeataZdunczyk opened this issue 10 months ago • 14 comments

Brief summary

Some of the data should be preserved between firmware versions. This includes:

BeataZdunczyk avatar Apr 17 '24 15:04 BeataZdunczyk

@krystian-hebel, some things to discuss here.


How to know that a new version is compatible with the old one? Will flashing driver have a table of versions eventually or should this information be embedded with a capsule in that vendor code? I guess, we can postpone actually dealing with this once we have an incompatible release, but thought I'll bring it up now.

By the way, from GenerateCapsule.py:

# This tool is intended to be used to generate UEFI Capsules to update the
# system firmware or device firmware for integrated devices. In order to
# keep the tool as simple as possible, it has the following limitations:
#   * Do not support vendor code bytes in a capsule.

The update code should be changed to:

  1. Start by reading current flash contents in full (not in pieces as done right now)
  2. Move the data and EFI variable from old image to new one when applicable
  3. Write new firmware image as now, using data from the previous steps

SMMSTORE - holds UEFI variables, e.g. user settings

SetVariable() likely needs to be reinitialized to write to new SMMSTORE (before it's flashed). Or just copy the whole region?

If copying the region, any EDK caches to flash initially?

Otherwise need to make EDK use new SMMSTORE region before it's written by somehow swapping firmware volume protocol (old variables need to be read before doing that).

Needs FMAP code.

ROMHOLE (MSI only) - required for FlashBIOS recovery

Need a set of MSI GUIDs to check against?

Parse FMAP in both old and new images and copy the data over if sizes/offsets match? Could hard-code, but it doesn't seem reliable.

SMBIOS - unique data like serial number or system UUID

Needs both FMAP and CBFS code. The latter is mostly GPL2-only.

boot logo - customizable by users

Also needs FMAP code, but can do without CBFS code if region size didn't change.


I think we don't need to upstream ROMHOLE/SMBIOS/logo, so GPL code shouldn't be a problem, but would this be the first time we add it to our fork? Any problems with that and need to work around it?

SergiiDmytruk avatar Sep 12 '24 14:09 SergiiDmytruk

in the best case, if the update driver determines FMAP to be the same in current and new versions, it can attempt to only flash code partitions (RW_SECTION_A, RW_SECTION_B, WP_RO in the Vboot case, COREBOOT in the non-vboot case, and SI_ALL if present). That way we preserve everything we need, and the update is faster and more reliable than attempting to transplant this data. We also get to keep the MRC cache, and we already have users complaining that the boot time after updating is long, caused by retraining. But then we'd have to somehow ensure that the FMAP never changes :)

mkopec avatar Sep 12 '24 14:09 mkopec

in the best case

Not sure we can assume the best case.

That way we preserve everything we need, and the update is faster and more reliable than attempting to transplant this data.

Don't know if it's more reliable or not. It's patching old image vs the new one, either way incompatibilities are possible.

We also get to keep the MRC cache

I guess it can be copied as well.


Since we'll need at least FMAP, I did implementation using just it here. Also includes CBFS code, which can be compiled, but it's not used yet. @JanPrusinowski, building from that branch should already make some tests pass.

SergiiDmytruk avatar Sep 15 '24 22:09 SergiiDmytruk

In its current state the branch copies everything in the OP, but SMMSTORE is copied as a region. SMBIOS data is moving using GPL code from cbfstool.

Replacing EFI protocols seems to not be feasible because VariablesDxe caches it and the code in general isn't meant to allow dynamic replacement of that sort. I was about to use efivars.c from coreboot instead, but later switched to smmstoretool which also has variable store creation in it. The code compiles and runs (copying all variables), but QEMU doesn't boot afterwards. Need to debug why.

SergiiDmytruk avatar Sep 17 '24 22:09 SergiiDmytruk

I guess, we can postpone actually dealing with this once we have an incompatible release, but thought I'll bring it up now.

There can be incompatibilities at different levels, some of them can be harder to deal with than others. We have LowestSupportedVersion, but that is a big hammer, which isn't always necessary. I haven't looked too closely at the code, but:

  • Changes to FMAP should be relatively easy to deal with, as long as old preserved data fits in new region.
  • Logo - part of the above.
  • SMBIOS - also relatively easy, CBFS is pretty stable, and if it changes we can bump LowestSupportedVersion.
  • SMMSTORE: as an example, https://github.com/Dasharo/DasharoModulePkg/commit/1a488a4da5643010d32004ee41bba525bf84b44a merged multiple variables into one. Changes like this one should be doable by moving old variables to new ones (deleting ones that are no longer necessary, calculating new values if needed) before applying the update, and then preserving whole SMMSTORE region.
    • When format of SMMSTORE changes, we can bump LowestSupportedVersion.
  • ROMHOLE - this is more about skipping than updating it. It is unlikely that it will be moved elsewhere, I wouldn't worry about it for now.

Replacing EFI protocols seems to not be feasible because VariablesDxe caches it and the code in general isn't meant to allow dynamic replacement of that sort.

Wouldn't it be enough to add new variables driver to capsule and let it repeat what VariableServiceInitialize() does (+ maybe recalculate CRC of RT table header)? All of the libraries are compiled into each driver, so unless something between FMP driver and reboot accesses external protocols which may have cached pointers to variable services functions (AFAICT nothing caches SetVariable, I haven't checked other functions), it should work.

Now that I think of it, it is coreboot's SMI handler that ultimately knows where SMMSTORE is (for writing, at least), and we won't be able to move it without rebooting into new coreboot, in case it is moved between versions.

I think we can leave it at copying whole region for now, and maybe leave accessing single variables as potential future improvements. We can access variables just fine before new image is flashed, so this should be just the question of running the capsule drivers in the correct order.

krystian-hebel avatar Sep 18 '24 13:09 krystian-hebel

Wouldn't it be enough to add new variables driver to capsule and let it repeat what VariableServiceInitialize() does

There is an event handler that needs to be triggered, global data and it also initializes libraries. Maybe it can be made to work, but that will take more effort and doesn't really look reliable.

I think we can leave it at copying whole region for now, and maybe leave accessing single variables as potential future improvements.

I made implementation from smmstoretool work. I just missed setting one field and also didn't skip copying of volatile variables. It's ~300 lines long.

SergiiDmytruk avatar Sep 18 '24 16:09 SergiiDmytruk

Send PR: https://github.com/Dasharo/edk2/pull/166

Regarding MRC cache. As far as I can tell, something like this should work:

if UNIFIED_MRC_CACHE region exists
    copy it
else
    if RECOVERY_MRC_CACHE region exists
        copy it
    copy RW_MRC_CACHE region

copy RW_VAR_MRC_CACHE region if exists ?

I don't know much about it and how safe it is to have these regions potentially containing incompatible garbage, but if somebody thinks it should work, it's an easy thing to add this to the PR.

SergiiDmytruk avatar Sep 19 '24 13:09 SergiiDmytruk

I have created tests for: SMMSTORE, SMBIOS and boot logo Availible: https://github.com/Dasharo/open-source-firmware-validation/pull/501

Also I used the latest release suggested by @SergiiDmytruk - however I must have done something wrong as I didnt notice any changes from previous builds except for that the logo tests now work. I have made some fixes so that tests will run on MSI. I have run those tests and - providing that the FW will support it test should work.

I still dont know how to create ROMHOLE data and put it into the fw. I have aed about ROMHOLETOOL in: https://github.com/Dasharo/open-source-firmware-validation/pull/501

JanPrusinowski avatar Sep 24 '24 10:09 JanPrusinowski

however I must have done something wrong as I didnt notice any changes from previous builds except for that the logo tests now work.

Please attach logs.

krystian-hebel avatar Sep 24 '24 11:09 krystian-hebel

@JanPrusinowski Note that there is an issue in coreboot preventing it from working on MSI without extra changes (works fine in QEMU): https://github.com/Dasharo/edk2/pull/166#discussion_r1768948404. The capsule update should just fail during an update and the fact that it didn't, shows that you probably built firmware with old revision of EDK2 (go to payloads/external/edk2/workspace/Dasharo and see current commit hash via git show or alike). It won't work yet though even with the correct revision unless you modify coreboot locally as I did (see the comment, this primarily needs to be done for initial firmware).

Log from Jira: capsule-preserve-data-log.zip

SergiiDmytruk avatar Sep 24 '24 12:09 SergiiDmytruk

https://github.com/Dasharo/edk2/pull/166 has been merged, but MSI requires fixes to coreboot to be able to read whole flash when it is bigger than 16 MiB.

krystian-hebel avatar Sep 24 '24 13:09 krystian-hebel

MSI requires fixes to coreboot to be able to read whole flash when it is bigger than 16 MiB.

Pushed https://github.com/Dasharo/coreboot/pull/509/commits/8c87fbd3499cc711143e561482cf3aaf630e608d to https://github.com/Dasharo/coreboot/pull/509, can be reviewed there.

SergiiDmytruk avatar Sep 24 '24 18:09 SergiiDmytruk

Logs requested by Krystian: log(9)1.zip

JanPrusinowski avatar Sep 25 '24 06:09 JanPrusinowski

New log from current PR (6a640f33c1f829692e99e8706af372064e3ef1ca): log_hopefully_final.zip

It was tested on Z690-A DDR4 with image built from https://github.com/Dasharo/coreboot/pull/509 (68e501abc20c9dc1d74c6b96b863b12f094bae67) with edk2 pointed at https://github.com/Dasharo/edk2/pull/172 (15ff38aa8a9ed5efea9619380978ce8f04286d2b).

krystian-hebel avatar Oct 04 '24 15:10 krystian-hebel