
v560tu seems slow on IO, needs confirmation

Open tlaurion opened this issue 10 months ago • 39 comments

The only thing I can do is compare across devices.

This is systemd-analyze blame output, for reference

My nv41 has bigger templates and is my main driver.

nv41 64 GB ram, drive: Samsung SSD 980 PRO 2TB

Image

v560tu 64gb ram, drive: SSDPR-PX700-04T-80

Image
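For anyone wanting to reproduce these numbers, the dom0 sequence is something like the following (a sketch, assuming a stock Qubes install):

systemd-analyze time               # overall startup split (kernel, initrd, userspace)
systemd-analyze blame | head -40   # slowest units, as in the screenshots above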

cc @macpijan

tlaurion avatar Jan 20 '25 20:01 tlaurion

I assume it can be useful to document disk/memory in both setups as well

macpijan avatar Jan 20 '25 21:01 macpijan

I assume it can be useful to document disk/memory in both setups as well

Modified OP @macpijan

tlaurion avatar Jan 20 '25 23:01 tlaurion

Not sure. Reencryption runs at 1.4 GiB/s, with a power draw of 35 W and no fan spinning.
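For a disk-independent, CPU-only data point between the two machines, cryptsetup benchmark could also be compared (a sketch, not numbers I've collected here):

cryptsetup benchmark                                           # PBKDF and cipher throughput summary
cryptsetup benchmark --cipher aes-xts-plain64 --key-size 512   # the LUKS2 default cipher/key size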

PXL_20250120_230242522.jpg

tlaurion avatar Jan 20 '25 23:01 tlaurion

What are the models of the SSDs in both laptops? The V560TU could have a DRAM-less disk, which is much slower at random I/O.

mkopec avatar Jan 21 '25 11:01 mkopec

I think it's reasonable to assume the disks are what @wessel-novacustom is offering in the configurator.

V56 offers only one option: https://novacustom.com/product/v56-series/

While NV41 used to offer Samsung disks: https://novacustom.com/product/nv41-series/

Can we please summarize the exact hw configurations being compared here?

macpijan avatar Jan 21 '25 12:01 macpijan

The NV41 can come with a 980 Pro (with DRAM cache) or a 980 (DRAM-less). Apparently the PX700 disk in the V560TU does not have a DRAM cache, but goodram sells it as HMB 3.0, which uses host RAM for caching. I have no idea whether or not that works in Qubes.
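If someone wants to check whether the HMB is actually active under Linux, something like this should show it (a sketch; assumes nvme-cli is installed in dom0 and the disk is nvme0):

dmesg | grep -i "host memory buffer"      # the kernel logs the allocated HMB size, if any
nvme id-ctrl /dev/nvme0 | grep -i hmpre   # HMPRE > 0 means the drive requests an HMB
nvme get-feature /dev/nvme0 -f 0x0d -H    # NVMe feature 0x0d is Host Memory Buffer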

mkopec avatar Jan 21 '25 12:01 mkopec

@macpijan @mkopec

I'm not sure if that's actually the issue.

If I compare I/O write and especially read operations (especially with small files), the -TU models are performing badly compared to the NVIDIA variants. That's under UEFI of course, but it might still make sense to compare and see whether there is a difference and why.

wessel-novacustom avatar Jan 21 '25 12:01 wessel-novacustom

Here's what it looks like with V560TU on my side (96GB / 2TB)

20250121_133744.jpg

mkopec avatar Jan 21 '25 12:01 mkopec

@mkopec @macpijan @wessel-novacustom : updated OP

nv41 64 GB ram, drive: Samsung SSD 980 PRO 2TB

and

v560tu 96gb ram, drive: SSDPR-PX700-04T-80

Opened issue https://github.com/QubesOS/qubes-issues/issues/9723

Want me to run some tests? Please detail.

Also note that as pointed out at https://github.com/linuxboot/heads/pull/1889#issuecomment-2598849847:

https://github.com/Dasharo/coreboot/compare/94e5f5d5b808cf8d8fd5c70d4ef6a08a054f8986...048ca832325d716fcab596822b10f5d493fc2312

When we attempted that coreboot version bump, performance worsened to the point of systemd services timing out during the first stage of the QubesOS installation, and templates installation took more than an hour.

tlaurion avatar Jan 21 '25 15:01 tlaurion

Also note that as pointed out at #1889 (comment):

Dasharo/[email protected]

When we attempted that coreboot version bump, perf worsen to the point of systemd services timing out on first stage of qubesos installation, and templates installation took more than an hour.

Some additional notes from testing, will edit (re-testing https://github.com/tlaurion/heads/tree/perf_comparison_with_reverted_coreboot_version_bump_v560tu)

  • RAM init takes around 2 minutes with 64 GB of RAM, not 1
  • Heads framebuffer refresh (fbwhiptail drawing) is slower
  • Heads hashing of /boot content is slower
  • Resealing the TPM Disk Unlock Key (CPU-based derivation of the DRK to validate the slot + new key from the LUKS DRK into a new slot) is way slower

The insight is that not only is IO slower; it is as if CPU speed were also capped at turtle speed.
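To check whether the CPU really is being capped, something like this could be dumped on both firmware versions (a sketch using the standard cpufreq sysfs layout, not yet collected here):

grep "MHz" /proc/cpuinfo | sort | uniq -c                   # current per-core frequencies
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq   # same, via cpufreq
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq   # the ceiling imposed by firmware/OS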

systemd-analyze blame with branch including changes to coreboot version bump:

Image

full log:

systemd-analyze_blame.txt

CC @macpijan @mkopec

tlaurion avatar Jan 21 '25 18:01 tlaurion

systemd-analyze blame with https://github.com/linuxboot/heads/commit/36e30d0174dbd351ea13ee1a5659979843939da3 for comparison.

Excerpt:

32.838s [email protected]
26.821s [email protected]
26.016s dev-disk-by\x2duuid-9254e32c\x2df3a1\x2d4ddf\x2d917d\x2d1496ed47d5d6.device
26.016s dev-disk-by\x2dpartuuid-3f923856\x2dba50\x2d4c86\x2d9d79\x2dcb60c453af91.device
26.016s dev-nvme0n1p3.device
26.016s sys-devices-pci0000:00-0000:00:06.0-0000:01:00.0-nvme-nvme0-nvme0n1-nvme0n1p3.device
26.016s dev-disk-by\x2did-nvme\x2dnvme.1e4b\x2d473441303037343132\x2d53534450522d50583730302d3034542d3830\x2d00000001\x2dpart3.device
26.016s dev-disk-by\x2dpath-pci\x2d0000:01:00.0\x2dnvme\x2d1\x2dpart3.device
26.016s dev-disk-by\x2did-nvme\x2dSSDPR\x2dPX700\x2d04T\x2d80_G4A007412\x2dpart3.device
25.993s dev-disk-by\x2dpartuuid-3a30b820\x2d3a74\x2d41fb\x2d938d\x2d5e0288966fd4.device
25.993s sys-devices-pci0000:00-0000:00:06.0-0000:01:00.0-nvme-nvme0-nvme0n1-nvme0n1p2.device
25.993s dev-nvme0n1p2.device
25.993s dev-disk-by\x2did-nvme\x2dnvme.1e4b\x2d473441303037343132\x2d53534450522d50583730302d3034542d3830\x2d00000001\x2dpart2.device
25.993s dev-disk-by\x2did-nvme\x2dSSDPR\x2dPX700\x2d04T\x2d80_G4A007412\x2dpart2.device
25.993s dev-disk-by\x2duuid-10f43b72\x2d51a0\x2d40ce\x2d8407\x2dc31d8e29b67e.device
25.993s dev-disk-by\x2dpath-pci\x2d0000:01:00.0\x2dnvme\x2d1\x2dpart2.device
25.980s dev-disk-by\x2dpath-pci\x2d0000:01:00.0\x2dnvme\x2d1\x2dpart4.device
25.980s dev-disk-by\x2duuid-fb504cfb\x2d542a\x2d4b62\x2da176\x2d515781072d00.device
25.980s dev-nvme0n1p4.device
25.980s sys-devices-pci0000:00-0000:00:06.0-0000:01:00.0-nvme-nvme0-nvme0n1-nvme0n1p4.device
25.980s dev-disk-by\x2did-nvme\x2dSSDPR\x2dPX700\x2d04T\x2d80_G4A007412\x2dpart4.device
25.980s dev-disk-by\x2dpartuuid-6675f426\x2d7f58\x2d47ec\x2dba1a\x2d54a321a95c54.device
25.980s dev-disk-by\x2did-nvme\x2dnvme.1e4b\x2d473441303037343132\x2d53534450522d50583730302d3034542d3830\x2d00000001\x2dpart4.device
25.832s dev-disk-by\x2did-nvme\x2dSSDPR\x2dPX700\x2d04T\x2d80_G4A007412\x2dpart1.device
25.832s dev-nvme0n1p1.device
25.832s dev-disk-by\x2dpartuuid-bf182adf\x2da263\x2d4c81\x2da12a\x2d789f69fb6263.device
25.832s sys-devices-pci0000:00-0000:00:06.0-0000:01:00.0-nvme-nvme0-nvme0n1-nvme0n1p1.device
25.832s dev-disk-by\x2did-nvme\x2dnvme.1e4b\x2d473441303037343132\x2d53534450522d50583730302d3034542d3830\x2d00000001\x2dpart1.device
25.832s dev-disk-by\x2dpath-pci\x2d0000:01:00.0\x2dnvme\x2d1\x2dpart1.device
25.745s dev-ttyS4.device
25.745s sys-devices-pci0000:00-0000:00:1e.0-dw\x2dapb\x2duart.3-dw\x2dapb\x2duart.3:0-dw\x2dapb\x2duart.3:0.0-tty-ttyS4.device
25.738s dev-ttyS19.device
25.738s sys-devices-platform-serial8250-serial8250:0-serial8250:0.19-tty-ttyS19.device
25.737s dev-ttyS16.device
25.737s sys-devices-platform-serial8250-serial8250:0-serial8250:0.16-tty-ttyS16.device
25.736s sys-devices-platform-serial8250-serial8250:0-serial8250:0.1-tty-ttyS1.device
25.736s dev-ttyS1.device
25.735s sys-devices-platform-serial8250-serial8250:0-serial8250:0.13-tty-ttyS13.device
25.735s dev-ttyS13.device
25.734s dev-ttyS10.device
25.734s sys-devices-platform-serial8250-serial8250:0-serial8250:0.10-tty-ttyS10.device
25.733s sys-devices-platform-serial8250-serial8250:0-serial8250:0.18-tty-ttyS18.device
25.733s dev-ttyS18.device
25.733s sys-devices-platform-serial8250-serial8250:0-serial8250:0.15-tty-ttyS15.device
25.733s dev-ttyS15.device
25.729s sys-devices-platform-serial8250-serial8250:0-serial8250:0.0-tty-ttyS0.device
25.729s dev-ttyS0.device
25.729s sys-devices-platform-serial8250-serial8250:0-serial8250:0.20-tty-ttyS20.device
25.729s dev-ttyS20.device

So similar (but still slower) compared to @mkopec's dump at https://github.com/linuxboot/heads/issues/1894#issuecomment-2604624721: 20250121_133744.jpg. I guess some discard/trim operations changed the numbers. I will reinstall with Qubes defaults and LVM and dump the final numbers on my side in the next comment.
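To take trim state out of the equation between runs, one option (not something I did for the numbers above) would be to trim everything right before benchmarking:

sudo fstrim -av   # trim all supported mounted filesystems, verbose output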

systemd-analyze_blame_master.txt

CC @macpijan @mkopec

tlaurion avatar Jan 21 '25 19:01 tlaurion

If I compare I/O write and especially read operations (especially with small files), the -TU models are performing badly compared to the NVIDIA variants. That's under UEFI of course, but it might still make sense to compare and see whether there is a difference and why.

I agree this should be checked. Do we already have an issue tracking this problem?

Also, do we want to do it right now? Should this gate the Heads release if (to be confirmed) performance is similar on the same device across UEFI and Heads firmware?

Ideally, we release Heads from the same (or a similar) coreboot base as the previous UEFI release.

macpijan avatar Jan 22 '25 11:01 macpijan

Some additional notes from testing, will edit (re-testing https://github.com/tlaurion/heads/tree/perf_comparison_with_reverted_coreboot_version_bump_v560tu)

* RAM init takes around 2 minutes with 96 GB of RAM, not 1

* Heads framebuffer refresh (fbwhiptail drawing) is slower

* Heads hashing of /boot content is slower

* Resealing the TPM Disk Unlock Key (CPU-based derivation of the DRK to validate the slot + new key from the LUKS DRK into a new slot) is way slower

The insight is that not only is IO slower; it is as if CPU speed were also capped at turtle speed.

systemd-analyze blame with branch including changes to coreboot version bump:

Thanks for this test; it will be a useful data point for the future. However, we should focus on testing what we are aiming to release right now.

We propose that we both confirm once again that the binary from this commit is "OK" in terms of performance, as reported by us here previously: https://github.com/linuxboot/heads/issues/1894#issuecomment-2604624721

Where "OK" means "comparable to the existing UEFI release on the same hardware specification", not "strictly better than the previous laptop model in this specific benchmark", especially since devices with different hardware configurations are being compared here.

If confirmed, we propose that we use this commit as the release for Dasharo+Heads @tlaurion @wessel-novacustom, so as not to postpone it any longer.

We can continue investigating the performance concerns, such as the one raised by @wessel-novacustom here https://github.com/linuxboot/heads/issues/1894#issuecomment-2604602279, in individual Dasharo issues.

macpijan avatar Jan 22 '25 12:01 macpijan

Ideally, we release Heads from the same (or a similar) coreboot base as the previous UEFI release.

  • I can see that, but this performance issue is one of the biggest issues the -TU series has.

Ideally, it would be fixed, despite the coreboot base being slightly different in that case.

In case a coreboot patch could be made to fix this issue, I'm interested in getting a link to that patch, so I can assist customers with UEFI firmware who are complaining about this.

My suggestion is to make a v0.9.2-rc1/v1.0.0-rc1 untested UEFI release with just this fix patch. We can then use that version as the base for Heads.

wessel-novacustom avatar Jan 22 '25 13:01 wessel-novacustom

Ideally, it would be fixed, despite the coreboot base being slightly different in that case.

I will move this off-channel to discuss quicker and come back here with conclusions.

macpijan avatar Jan 22 '25 13:01 macpijan

We propose that we both confirm once again that the binary from this commit is "OK" in terms of performance as reported by us here previously: https://github.com/linuxboot/heads/issues/1894#issuecomment-2604624721

  • 28.983s - 155H, 32G RAM, 1TB SSD

mkopec avatar Jan 22 '25 13:01 mkopec

...isn't [email protected] dependent on the network connection anyway? So we're no longer comparing only I/O speeds. It also continues starting after I've already logged in, as confirmed with systemctl list-jobs; it starts up in the background.
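To separate units that merely keep starting in the background from ones that actually block boot, these two in dom0 are enough (a sketch):

systemctl list-jobs              # whatever is still queued or running after login
systemd-analyze critical-chain   # only the units on the critical path to the default target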

mkopec avatar Jan 22 '25 13:01 mkopec

Here's my kdiskmark result in personal Qube on the same laptop with 155H, 32GB RAM, 1TB SSD, AC in:

Image

all with default settings. Of course, with LUKS, LVM and Xen between the disk and the benchmark, this isn't measuring raw I/O, but it should be more objective.

@tlaurion can you share yours for comparison?

mkopec avatar Jan 22 '25 14:01 mkopec

Here's my kdiskmark result in personal Qube on the same laptop with 155H, 32GB RAM, 1TB SSD, AC in:

Image

all with default settings. Of course, with LUKS, LVM and Xen between the disk and the benchmark, this isn't measuring raw I/O, but it should be more objective.

@tlaurion can you share yours for comparison?

Sorry, 64 GB; will edit prior posts :( The previous v560tu had 96 GB and a 2 TB drive. This one has 64 GB and a 4 TB M.2 drive.

PXL_20250122_171917296.jpg

PXL_20250122_165619680.MP.jpg

PXL_20250122_170010896.jpg

tlaurion avatar Jan 22 '25 17:01 tlaurion

Thanks for testing! So you are seeing significantly worse performance than me... I think next we will test with the same 4 TB disk model as in your V560TU.

mkopec avatar Jan 22 '25 17:01 mkopec

It's really hard to compare things, both intra-group and inter-group.

For example, the nv41 doesn't require installation with kernel-latest. Also, my nv41 setup is Btrfs-based with heavy optimizations, as opposed to the default LVM setup: no revisions to keep (leaving wyng to keep the one revision, the snapshot corresponding to the last backup), discard=async in fstab, and bees deployed (/var/lib/qubes fully deduplicated). So I cannot directly compare stats between the nv41 setup and the v560tu on my side.
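For reference, the kind of fstab entry meant by "discard=async" above (device path and mount options are illustrative, not a copy of my actual fstab):

/dev/mapper/qubes_dom0-root  /  btrfs  defaults,noatime,discard=async  0  0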

Still, here are some stats that could be eye-opening when considering what needs to be improved next.

nv41 stats

snap: let's not use that. debian-template, stable kernel (6.6.68)

Image

fedora-40-dvm, stable kernel (6.6.68):

Image

fedora-40-dvm, latest kernel (after dom0 sudo qubes-dom0-update kernel-latest kernel-latest-qubes-vm deploying 6.12.9, rebooting and launching kdiskmark)

Image

Needless to say, kernel-latest is a necessity (for the v560tu) but not necessarily good news for performance. I will personally revert to the non-latest kernel on the nv41, because I can, as opposed to the v560tu, which requires kernel-latest.
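For anyone reproducing this comparison, the per-VM kernel can be listed and switched in dom0 like this (the VM name is just an example):

ls /var/lib/qubes/vm-kernels/              # kernels available to VMs
qvm-prefs fedora-40-dvm kernel             # which kernel the VM currently boots
qvm-prefs fedora-40-dvm kernel <version>   # switch it, using a version from the listing above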

CC: @macpijan @mkopec @marmarek flag raised in qubes-public matrix channel

tlaurion avatar Jan 22 '25 18:01 tlaurion

We should mark this as a known issue for the Dasharo coreboot+Heads release. @macpijan @tlaurion

wessel-novacustom avatar Jan 22 '25 21:01 wessel-novacustom

We should mark this as a known issue for the Dasharo coreboot+Heads release. @macpijan @tlaurion

How is this a Heads issue, as opposed to Dasharo-UEFI (or simply coreboot), the M.2 drive, or the QubesOS latest kernel? I'm not sure I follow.

As depicted in the 3rd picture at https://github.com/linuxboot/heads/issues/1894#issuecomment-2607968020:

Switching the nv41 to the QubesOS latest kernel showed perf issues similar to the v560tu, which (unlike the nv41) requires installation and use of kernel-latest at install time. No? Then there is the 4 TB drive, which has no DRAM and is offered as an option, as compared to the tests of @mkopec. Sorry, but this cannot be Heads specific, nor Heads' fault.

I'm OK with listing this as a known issue if sub-issues are opened and referred to in downstream releases. This issue will stay open and be referenced in the created sub-issues.

My gut feeling here is that it's either an important perf issue caused by the latest kernel (on which the v560tu depends), the 4 TB drive lacking DRAM, or coreboot doing something funky, but Heads has nothing to do with what is observed here. Heads' job is long done and irrelevant to the observed issue, unless the coreboot commit used differs between the two versions and is the cause for the same HCL.

tlaurion avatar Jan 23 '25 04:01 tlaurion

It's not specifically a Heads issue, @tlaurion. I just mean that we won't further investigate this for the Dasharo version that @macpijan is about to release.

wessel-novacustom avatar Jan 23 '25 06:01 wessel-novacustom

It's not specifically a Heads issue, @tlaurion. I just mean that we won't further investigate this for the Dasharo version that @macpijan is about to release.

So I guess what you mean is that the same notes should be present for both Dasharo-UEFI and Dasharo-Heads, since it's not Heads specific, and if it is, it's because of a coreboot commit difference.

tlaurion avatar Jan 23 '25 07:01 tlaurion

Issue has been raised separately for UEFI: https://github.com/Dasharo/dasharo-issues/issues/1216

wessel-novacustom avatar Jan 23 '25 07:01 wessel-novacustom

@tlaurion do you have any CLI version of a benchmark that shows this issue? Something that would produce text output that I can then parse with a script, compare with plain diff, etc. I would like to add a disk performance test to our CI, but manually comparing graphical screenshots is not going to fly.

marmarek avatar Jan 25 '25 20:01 marmarek

Maybe some specific configs to the fio tool?

marmarek avatar Jan 25 '25 20:01 marmarek

@tlaurion do you have any CLI version of a benchmark that shows this issue? Something that would produce a text output that I can then parse with a script, compare with plain diff etc. I would like to add disk performance test to our CI, but manually comparing graphical screenshots is not going to fly.

I don't have insights on this matter, unfortunately.

tlaurion avatar Jan 25 '25 21:01 tlaurion

Maybe some specific configs to the fio tool?

@marmarek I can reuse any command-line invocation you propose. kdiskmark is what end users use on the forum to show things visually. Replicating what kdiskmark does from the command line could help here, yes.
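A rough starting point could be fio jobs approximating kdiskmark's default (CrystalDiskMark-style) profile; the block sizes, queue depths and test file path below are my guesses, not an exact replica of what kdiskmark runs:

fio --name=seq1m-q8-read --filename=/home/user/fio-test --size=1G \
    --rw=read --bs=1M --iodepth=8 --ioengine=libaio --direct=1 \
    --runtime=15 --time_based --group_reporting
fio --name=rnd4k-q32-read --filename=/home/user/fio-test --size=1G \
    --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --direct=1 \
    --runtime=15 --time_based --group_reporting
fio --name=rnd4k-q1-read --filename=/home/user/fio-test --size=1G \
    --rw=randread --bs=4k --iodepth=1 --ioengine=libaio --direct=1 \
    --runtime=15 --time_based --group_reporting

The write side would be the same jobs with --rw=write / --rw=randwrite, and adding --output-format=json should give output that is easy to parse and diff in CI.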

tlaurion avatar Jan 25 '25 21:01 tlaurion