ZOS detected one of the disks as hdd instead of ssd

Open sabrinasadik opened this issue 2 years ago • 8 comments

Node 4127, see https://github.com/threefoldtech/tf_support/issues/356

sabrinasadik avatar Aug 11 '22 08:08 sabrinasadik

Was the node updated? At the time of writing this comment, the node still has ~4T of SSD and ~1T of HDD. Note that zos only detects disk speed once, on boot. It doesn't rely on the type the disk reports; instead, it runs speed tests. A disk that is too slow or of poor quality may be classified as HDD even if it is actually an SSD.
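To illustrate the idea of classifying a disk by measured speed rather than by its self-reported type, here is a minimal sketch. It is not the zos implementation: the metric (read throughput in MB/s), the `SSD_MIN_READ_MBPS` cutoff, and the function name are all assumptions made up for this example.

```python
from enum import Enum


class DeviceType(Enum):
    SSD = "ssd"
    HDD = "hdd"


# Hypothetical cutoff chosen for illustration only; the thread does not
# state which metric or threshold zos actually uses.
SSD_MIN_READ_MBPS = 200.0


def classify_by_speed(read_mbps: float,
                      threshold_mbps: float = SSD_MIN_READ_MBPS) -> DeviceType:
    """Classify a disk purely from a measured speed.

    The type the disk reports about itself is deliberately not an input,
    so a slow or degraded SSD ends up classified as HDD.
    """
    if read_mbps >= threshold_mbps:
        return DeviceType.SSD
    return DeviceType.HDD
```

Under this kind of scheme, a single slow measurement (for example from a busy or failing drive) is enough to flip an SSD into the HDD bucket.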

muhamadazmy avatar Aug 11 '22 10:08 muhamadazmy

2nd user with similar issue.

Farm ID 138 - Clubbing bear

Node 2293 had 0 HDD and 2.48 TB SSD. After a reboot, he suddenly got a different node ID (4296) with 2 TB HDD and 480.07 GB SSD.

sabrinasadik avatar Aug 12 '22 09:08 sabrinasadik

We asked the user of farm 138 to reboot, which he did. Now Node 2293 is back online with the right amount of HDD and SSD. Very strange behavior. Please investigate.

sabrinasadik avatar Aug 12 '22 12:08 sabrinasadik

@sabrinasadik I will.

What I think might have happened is this:

  • the node was already running before the storage fix (to only detect disks on boot)
  • then at some point the bad detection happened
  • then the node got the fix, and was now stuck with the latest speed test values
  • after rebooting, the node re-detected the speed correctly, and now it's persisted and should remain.

I will still investigate what might have gone wrong.
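The hypothesised sequence above can be modelled with a small sketch. This is an assumption-laden illustration, not zos code: `DiskTypeCache`, `on_boot`, and the `measure` callback are hypothetical names. It only shows why a bad reading sticks until the next boot when detection runs at boot time only.

```python
from typing import Callable, Dict, Iterable


class DiskTypeCache:
    """Holds disk types that are measured only when on_boot() is called."""

    def __init__(self, measure: Callable[[str], str]) -> None:
        # measure(device) returns "ssd" or "hdd"; hypothetical speed test.
        self._measure = measure
        self._types: Dict[str, str] = {}

    def on_boot(self, devices: Iterable[str]) -> None:
        # Re-run the speed test for every device and replace stored values.
        self._types = {dev: self._measure(dev) for dev in devices}

    def device_type(self, device: str) -> str:
        # No re-measurement here: the value from the last boot is returned,
        # even if it came from a bad reading.
        return self._types[device]
```

In this model, a wrong value recorded at one boot is reported unchanged until `on_boot` runs again, which matches the cases that cleared after a reboot.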

muhamadazmy avatar Aug 16 '22 07:08 muhamadazmy

Any updates?

sabrinasadik avatar Sep 02 '22 11:09 sabrinasadik

Anecdotally, I see reports of this from time to time. They always clear with a reboot and don't seem to recur. I'll document further cases here if I see them.

scottyeager avatar Sep 28 '22 19:09 scottyeager

Here's a new case, node ID 3755 on mainnet. This one is not clearing with a reboot. We can see that the farmer minted for last month with the properly detected specs:

image

But now part of the SSD is showing as HDD in the explorer (verified this against grid proxy too):

image

It sounds like this happened while the node was online. Double checking that with the farmer and will update here.

scottyeager avatar Oct 15 '22 00:10 scottyeager

Yes @scottyeager - it appears to have occurred while the node was online.

Quoting the user: "Yes it happened when it was powered on because that's when I restarted it thinking that would fix the issue, I hope it's not because of the 2.5 SSDs because all my other node have m.2 NVMe and those don't have any issue"

TullysInc avatar Oct 20 '22 18:10 TullysInc