DeepSea
ceph.stage.discovery fails if disk doesn't have entry in /dev
Description of Issue/Question
During the discovery phase, proposal.populate fails for just one node.
proposal.generate:
nodea:
The minion function caused an exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1455, in _thread_return
    return_data = executor.execute()
  File "/usr/lib/python2.7/site-packages/salt/executors/direct_call.py", line 28, in execute
    return self.func(*self.args, **self.kwargs)
  File "/var/cache/salt/minion/extmods/modules/proposal.py", line 262, in generate
    disks = cephdisks.list_(**kwargs)
  File "/var/cache/salt/minion/extmods/modules/cephdisks.py", line 480, in list_
    return hwd.assemble_device_list()
  File "/var/cache/salt/minion/extmods/modules/cephdisks.py", line 472, in assemble_device_list
    self._preflight_check(hardware)
  File "/var/cache/salt/minion/extmods/modules/cephdisks.py", line 420, in _preflight_check
    raise ValueError("{} is not included in the hardware dict.".format(rf))
ValueError: Capacity is not included in the hardware dict.
When debugging cephdisks.py we saw the following:
2018-12-10 12:56:38,947 No partitions detected on sdd
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/qxn2533/cephdisks.py", line 478, in assemble_device_list
    self._preflight_check(hardware)
  File "/home/qxn2533/cephdisks.py", line 426, in _preflight_check
    raise ValueError("{} is not included in the hardware dict.".format(rf))
ValueError: Capacity is not included in the hardware dict.
The culprit seems to be in assemble_device_list and _query_disktype. In the first, a glob is used on /sys/block/*/device. In _query_disktype, smartctl is used to get disk info. In our case, the device existed in /sys/block/*, but because the disk was in a partially failed state, it didn't exist in /dev. Therefore, smartctl was run with a nonexistent device name, for which it conveniently returns return code 0.
The developers already anticipated this, but the actual parsing code has not yet been added to cephdisks.py:
for line in proc.stdout:
# ADD PARSING HERE TO DETECT FAILURE
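One possible way to guard against this, sketched under assumptions: skip any device that appears under /sys/block but has no node in /dev, and additionally scan the smartctl output for failure text instead of trusting the return code. The helper names (has_dev_node, usable_devices, output_indicates_failure) and the marker strings are hypothetical and are not part of cephdisks.py. The markers in particular are a guess and should be adjusted to whatever smartctl actually prints on the affected system.

```python
import os

# Assumed failure markers; not taken from cephdisks.py or verified against
# every smartctl version. Adjust to the messages seen on the target system.
FAILURE_MARKERS = ("No such device", "No such file or directory")


def has_dev_node(name, dev_root="/dev"):
    """Return True if a device name (e.g. 'sdd') has an entry under dev_root."""
    return os.path.exists(os.path.join(dev_root, name))


def usable_devices(names, dev_root="/dev"):
    """Split device names into (present, missing) based on entries in dev_root."""
    present = [n for n in names if has_dev_node(n, dev_root)]
    missing = [n for n in names if not has_dev_node(n, dev_root)]
    return present, missing


def output_indicates_failure(lines, markers=FAILURE_MARKERS):
    """Return True if any output line contains one of the failure markers."""
    return any(marker in line for line in lines for marker in markers)
```

With something like this in place, a disk such as sdd in the report above would end up in the missing list and could be logged and skipped before smartctl is ever called, rather than triggering the ValueError in _preflight_check.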
Versions Report
SLES12SP2
deepsea-0.8.5-2.16.1.noarch
salt-2016.11.4-46.20.2.x86_64
ses-release-POOL-5-1.54.x86_64
salt-minion-2016.11.4-46.20.2.x86_64
salt-api-2016.11.4-46.20.2.x86_64
ses-release-5-58.1.x86_64
salt-master-2016.11.4-46.20.2.x86_64