dasharo-issues
V54: BIOS settings are randomly reset
Component
Dasharo firmware
Device
NovaCustom V54 14th Gen
Dasharo version
v0.9.0
Dasharo Tools Suite version
No response
Test case ID
No response
Brief summary
On a new V54x_6x_TU laptop, we have noticed several times that important BIOS settings such as Secure Boot, camera disable, and Intel ME had been reset. We can't yet confirm that every setting was reset. On Ubuntu 24.04, the security audit settings showed that Secure Boot was suddenly disabled after we had been working with the machine. Because of this unstable state, we can no longer trust the BIOS settings. There is definitely no long startup that would indicate memory re-training. Has anyone observed something like this?
How reproducible
Not reproducible / happens randomly
How to reproduce
We can't reproduce it currently.
Expected behavior
BIOS settings should be persistent.
Actual behavior
BIOS settings like secure boot or Intel ME are reset.
Screenshots
No response
Additional context
No response
Solutions you've tried
- The CMOS battery connector is stable and fits correctly
- The CMOS battery has a stable voltage, checked with a voltmeter
- Reset the CMOS battery following NovaCustom's instructions; after that, the laptop needed more time to boot, as expected, while re-training the memory
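For anyone who wants to catch such a reset from the running OS rather than by entering setup, here is a minimal sketch. It assumes a Linux system booted in UEFI mode with efivarfs mounted at `/sys/firmware/efi/efivars`, and reads the standard UEFI `SecureBoot` variable (4 bytes of attributes followed by one data byte, where 1 means enabled). It only covers Secure Boot, not the other Dasharo options, and the helper names are made up for illustration.

```python
import struct
from pathlib import Path
from typing import Tuple

# Standard UEFI global-variable GUID; the path assumes efivarfs is mounted here.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def parse_efivar(raw: bytes) -> Tuple[int, bytes]:
    """Split an efivarfs payload into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("efivarfs payload is shorter than the 4-byte attribute header")
    (attributes,) = struct.unpack_from("<I", raw)
    return attributes, raw[4:]


def secure_boot_enabled(raw: bytes) -> bool:
    """Return True if the SecureBoot variable payload reports 'enabled' (data byte 1)."""
    _, data = parse_efivar(raw)
    return len(data) >= 1 and data[0] == 1


if __name__ == "__main__":
    if SECURE_BOOT_VAR.exists():
        print("Secure Boot enabled:", secure_boot_enabled(SECURE_BOOT_VAR.read_bytes()))
    else:
        print("SecureBoot variable not found (legacy boot or efivarfs not mounted)")
```

Running this periodically, for example at login, and logging the result would at least give a timestamp for when the state flipped.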
@byteboltsec Could you please let us know what exact firmware settings were set differently when compared to the default settings? Were Early boot DMA Protection and Keep IOMMU enabled when transfer control to OS set as well?
@macpijan Any idea what could have caused the issue?
I have seen something like this happen randomly on the MTL laptops. It looked like the whole region with variables and settings was deemed invalid and thus reinitialized with defaults. But the last time I saw it was in the earliest versions of the firmware (around v0.9.0 for iGPU variants). I haven't seen it on the most recent versions (including developer builds) for a couple of months.
@mkopec I believe you have seen it as well at the beginning of the MTL firmware development.
Hi @wessel-novacustom , I think that a "Reset to Defaults" is triggered randomly. We changed e.g.:
- Enable Camera -> false
- Intel ME mode -> Disabled (GAP)
- Battery Start Charge Threshold -> 78
- Battery Stop Charge Threshold -> 80
But I can't really confirm or remember whether Early boot DMA Protection and Keep IOMMU enabled when transfer control to OS were changed.
@miczyg1 @mkopec @macpijan Can we send test binaries of the coreboot + EDK II rebase or is that a bad idea?
Binaries are available all the time on CI: https://github.com/Dasharo/coreboot/actions
It is as simple as "click one workflow and download an artifact". But it is always a bad idea to experiment on dev binaries without a recovery method.
Thanks, I can see that. I have promised to provide recovery free of charge under warranty in case of a brick.
This issue has been reported twice now. Both customers had 2× 32 GB of internal memory. I'm not sure if this is coincidence or not, but I thought it was worth mentioning it.
In fact, there is a known issue with the SMI/SMM storage, causing this sporadic reset. The reproducibility is low, making the issue hard to debug. Here are some related issues:
https://github.com/Dasharo/dasharo-issues/issues/1364 https://github.com/Dasharo/dasharo-issues/issues/1349 https://github.com/Dasharo/dasharo-issues/issues/1338
It was a process of understanding what's going on.
The underlying issue for this issue and the mentioned issues should have been fixed with this PR: https://github.com/Dasharo/dasharo-issues/issues/1338#issuecomment-3102785948
The issue is still occurring very rarely on v1.0.0-rc4.
> The reproducibility is low, making the issue hard to debug
The reproducibility is extremely low: during a whole week of testing rc4 on the V540TU, it didn't happen even once.
> The issue is still occurring very rarely on v1.0.0-rc4.
@wessel-novacustom
That is unfortunate to hear. What do you mean by very rarely? Did it happen more than once during this period? Do you have any rough scenario to reproduce it, given that you apparently hit it within just 3 days of RC4 being published?
Possibly relevant report in the matrix channel: https://matrix.to/#/!HqyQrXVyqEGAXgjatF:matrix.org/$i6F_r5YEeq37mCZ_IeQW1E4aDt9pN5QoIdsxiNHw_FM
Just to pitch in: I've also had this issue occasionally on a V56.
As always, binaries will be in: https://github.com/Dasharo/coreboot/actions/runs/17325122509 if someone is willing to test it on their own.
I ran a short stress test and did not encounter a configuration reset in 10 reboots. Should one happen, serial redirection would default to disabled and the connection would be lost.
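The stress test described above can be sketched as a simple loop. This is only an illustration: `reboot` and `serial_alive` are hypothetical callables that you would have to implement for your own setup (for example a power-cycle helper and a serial console probe), and nothing here is taken from the actual test tooling.

```python
from typing import Callable, Optional


def first_reset_cycle(
    cycles: int,
    reboot: Callable[[], None],
    serial_alive: Callable[[], bool],
) -> Optional[int]:
    """Reboot up to `cycles` times and return the 1-based cycle at which the
    serial console was lost (taken here as a sign that settings fell back to
    defaults), or None if the console survived every cycle."""
    for cycle in range(1, cycles + 1):
        reboot()                # hypothetical: trigger a reboot of the device under test
        if not serial_alive():  # hypothetical: probe the serial console after boot
            return cycle
    return None
```

Given how rarely the reset shows up, 10 cycles is a small sample; the same loop can be left running with a much larger `cycles` value.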
V560TU bricked after changing settings, during NBA001.001 Network Boot
- Test enters setup menu, enables network boot
- After the reboot, the "no battery" prompt appeared in the top-left corner instead of the middle of the screen
- The setup menu was then displayed in the top-left corner as well
- Test didn't find iPXE in setup menu (despite supposedly having enabled network boot) and did a power cycle
From that moment on the laptop has been bricked: the EC works, but the laptop doesn't boot and there is no screen backlight. It probably halts somewhere before the fan curve is set, since the fans are a bit louder than usual.
Happened once so far
Reproduced again, with the same test. Binary dump:
[brick.zip](https://github.com/user-attachments/files/22136301/brick.zip)
Tried to analyze with romscope:
λ ./romscope compare ../brick_copy ../dcu/working_read.rom
===== Preparation =====
Extracting file /home/flewinski/workspace/rc6testing/brick-analyze/brick_copy
Extracting file /home/flewinski/workspace/rc6testing/brick-analyze/dcu/working_read.rom
===== Comparison =====
IBG keys match.
Generating report for regions/fmap/SI_ME.bin
Generating report for regions/fmap/SMMSTORE.bin
Vblock regions/fmap/GBB.bin matches.
Vblock regions/fmap/VBLOCK_A.bin matches.
===== Conclusions =====
Not all files match. Check report for detailed information.
Report placed in folder: 'report'
λ ls report
regions-fmap-SI_ME.bin.html regions-fmap-SMMSTORE.bin.html
I'm still able to operate on the not-working binary with dcu/cbfstool, so it looks like it's not entirely garbled
SMMSTORE looks adequate at first glance, but since the only things different are ME region and SMMSTORE, something must be wrong with the latter (assuming ME is fine which is probably a valid assumption). Now that we write to SMMSTORE in coreboot (https://github.com/Dasharo/coreboot/pull/760), that's another suspect.
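To see where exactly the two SMMSTORE regions differ, a byte-level comparison of the extracted dumps can help. The following is a rough sketch, assuming both regions were extracted to files of equal size; the file paths are passed on the command line and the function name is made up here. It is not what romscope does internally.

```python
import sys
from typing import List, Tuple


def diff_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return half-open (start, end) offset ranges where the two buffers differ."""
    if len(a) != len(b):
        raise ValueError("region dumps must have the same size")
    ranges: List[Tuple[int, int]] = []
    start = None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = offset  # a new differing run begins here
        elif start is not None:
            ranges.append((start, offset))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(a)))  # run extends to the end of the buffer
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as f1, open(sys.argv[2], "rb") as f2:
        for begin, end in diff_ranges(f1.read(), f2.read()):
            print(f"0x{begin:06x}-0x{end:06x} ({end - begin} bytes differ)")
```

The reported offsets are relative to the start of the region dump, not to the full flash image.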
By the way, I just realized there is no FaultTolerantWrite in coreboot, although that's not the issue here (otherwise SMMSTORE would be effectively empty and the settings would have been reset).
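For readers unfamiliar with the idea, here is a toy model of what a fault-tolerant write protects against. It is a simplified illustration of the general technique (stage the new content in a spare block and journal the intent before erasing the target), not EDK II's actual FaultTolerantWrite driver; all class and function names are invented for this sketch.

```python
class PowerLoss(Exception):
    """Raised to simulate losing power in the middle of a write."""


class ToyFlash:
    """In-memory stand-in for a flash part with a target block, a spare block and a journal."""

    def __init__(self, initial: bytes):
        self.target = bytearray(initial)
        self.spare = bytearray(len(initial))
        self.journal = "idle"  # "idle" or "spare_valid"


def ftw_write(flash: ToyFlash, new_content: bytes, fail_after_erase: bool = False) -> None:
    """Replace the target block in a way that can be resumed after power loss."""
    if len(new_content) != len(flash.target):
        raise ValueError("new content must match the block size")
    flash.spare[:] = new_content                    # 1. stage the data in the spare block
    flash.journal = "spare_valid"                   # 2. record that the spare holds valid data
    flash.target[:] = b"\xff" * len(flash.target)   # 3. erase the target (erased flash reads 0xFF)
    if fail_after_erase:
        raise PowerLoss("simulated power loss between erase and program")
    flash.target[:] = flash.spare                   # 4. program the target from the spare
    flash.journal = "idle"                          # 5. mark the operation complete


def ftw_recover(flash: ToyFlash) -> bool:
    """On boot, finish an interrupted write. Returns True if a recovery was performed."""
    if flash.journal == "spare_valid":
        flash.target[:] = flash.spare
        flash.journal = "idle"
        return True
    return False
```

Without the spare block and journal, a power cut after step 3 would leave the target erased, which is the "effectively empty store, settings reset" symptom mentioned above.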
This issue may be debuggable. After writing bricked SMMSTORE into a QEMU image and trying to run it, the boot stops after:
[...]
[Bds] Entry...
[BdsDxe] Locate Variable Policy protocol - Success
Quiet Boot disabled.
Fast Boot disabled.
Console on demand disabled.
Variable Driver Auto Update PlatformLang, PlatformLang:en, Lang:eng Status: Success
This may be the first time EDK tries to write a variable and it apparently hangs while trying.
I've just managed to reproduce this manually, by enabling the non-functional DMA protection option prior to toggling Network Boot back and forth. This seems to be what the automatic full regression does in OSFV, with the DMA test right before Network Boot.
We're disabling this option for the release, so maybe that will cut off some faulty path.
> I've just managed to reproduce this manually, by enabling the non-functional DMA protection option prior to toggling Network Boot back and forth
A NUC Box user reported the same today.
Maybe good to re-enable it with a hotfix release within a few months from now? We should have a release before the Qubes summit.
IIRC, DMA protection is currently off-limits since the problem is within the Intel FSP for Meteor Lake, and we're not legally able to redistribute modified FSP. So we must wait for Intel to fix it in a newer version.
Ok, that's fine. Then let's integrate it again once Intel has fixed it.
> and we're not legally able to redistribute modified FSP.
Correction, we are allowed, but we do not have the latest sources available to us.