
Problem with OSD Prepare

pomland-94 opened this issue on Sep 22, 2024 · 9 comments

When I try to install Rook Ceph as described in the QuickStart Guide, I get an error while the OSDs are being prepared. None of the config files (operator.yaml, crd.yaml, common.yaml) were modified.
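
For reference, the install was just the standard QuickStart sequence applied to the unmodified example manifests (sketched below; exact file names can vary between Rook releases):

```
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
```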

I am using Kubernetes 1.30.4 on Debian 12 (ARM64). These are the Pod logs from one of the osd-prepare Pods:

```
[2024-09-21 20:33:50,685][ceph_volume.util.disk][INFO  ] opening device /dev/sdb to check for BlueStore label
[2024-09-21 20:33:50,686][ceph_volume.process][INFO  ] Running command: /usr/sbin/udevadm info --query=property /dev/sdb
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout DEVPATH=/devices/pci0000:00/0000:00:02.5/0000:06:00.0/virtio4/host0/target0:0:0/0:0:0:2/block/sdb
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout DEVNAME=/dev/sdb
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout DEVTYPE=disk
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout DISKSEQ=12
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout MAJOR=8
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout MINOR=16
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout SUBSYSTEM=block
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout USEC_INITIALIZED=6312815107
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_SCSI=1
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_VENDOR=HC
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_VENDOR_ENC=HC\x20\x20\x20\x20\x20\x20
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_MODEL=Volume
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_MODEL_ENC=Volume\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_REVISION=2.5+
[2024-09-21 20:33:50,696][ceph_volume.process][INFO  ] stdout ID_TYPE=disk
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_SERIAL=0HC_Volume_101330090
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_SERIAL_SHORT=101330090
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_SCSI_SERIAL=101330090
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_BUS=scsi
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_PATH=pci-0000:06:00.0-scsi-0:0:0:2
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout ID_PATH_TAG=pci-0000_06_00_0-scsi-0_0_0_2
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout DEVLINKS=/dev/disk/by-id/scsi-0HC_Volume_101330090 /dev/disk/by-path/pci-0000:06:00.0-scsi-0:0:0:2 /dev/disk/by-diskseq/12
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout TAGS=:systemd:
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] stdout CURRENT_TAGS=:systemd:
[2024-09-21 20:33:50,697][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-bluestore-tool show-label --dev /dev/sdb
[2024-09-21 20:33:50,729][ceph_volume.process][INFO  ] stderr unable to read label for /dev/sdb: (2) No such file or directory
[2024-09-21 20:33:50,730][ceph_volume.process][INFO  ] stderr 2024-09-21T20:33:50.721+0000 ffffaa316040 -1 bluestore(/dev/sdb) _read_bdev_label unable to decode label at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
[2024-09-21 20:33:50,730][ceph_volume.process][INFO  ] Running command: /usr/sbin/blkid -c /dev/null -p /dev/sdb
[2024-09-21 20:33:50,755][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-authtool --gen-print-key
[2024-09-21 20:33:50,778][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 8c3272ec-5983-4709-bcc7-69b83fa1bbc0
[2024-09-21 20:33:51,198][ceph_volume.process][INFO  ] stdout 0
[2024-09-21 20:33:51,198][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-authtool --gen-print-key
[2024-09-21 20:33:51,229][ceph_volume.process][INFO  ] Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
[2024-09-21 20:33:51,233][ceph_volume.util.system][INFO  ] CEPH_VOLUME_SKIP_RESTORECON environ is set, will not call restorecon
[2024-09-21 20:33:51,234][ceph_volume.process][INFO  ] Running command: /usr/bin/chown -R ceph:ceph /dev/sdb
[2024-09-21 20:33:51,239][ceph_volume.process][INFO  ] Running command: /usr/bin/ln -s /dev/sdb /var/lib/ceph/osd/ceph-0/block
[2024-09-21 20:33:51,244][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
[2024-09-21 20:33:51,678][ceph_volume.process][INFO  ] stderr got monmap epoch 3
[2024-09-21 20:33:51,705][ceph_volume.util.prepare][INFO  ] Creating keyring file for osd.0
[2024-09-21 20:33:51,705][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key AQCuLe9mQ6QxLhAA4cEJJ6+xclcEmE0vMc3TGA==
[2024-09-21 20:33:51,742][ceph_volume.process][INFO  ] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
[2024-09-21 20:33:51,747][ceph_volume.process][INFO  ] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
[2024-09-21 20:33:51,751][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 8c3272ec-5983-4709-bcc7-69b83fa1bbc0 --setuser ceph --setgroup ceph
[2024-09-21 20:33:51,785][ceph_volume.devices.raw.prepare][ERROR ] raw prepare was unable to complete
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 80, in safe_prepare
    self.prepare()
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 118, in prepare
    prepare_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 53, in prepare_bluestore
    prepare_utils.osd_mkfs_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/util/prepare.py", line 459, in osd_mkfs_bluestore
    raise RuntimeError('Command failed with exit code %s: %s' % (returncode, ' '.join(command)))
RuntimeError: Command failed with exit code -11: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 8c3272ec-5983-4709-bcc7-69b83fa1bbc0 --setuser ceph --setgroup ceph
[2024-09-21 20:33:51,786][ceph_volume.devices.raw.prepare][INFO  ] will rollback OSD ID creation
[2024-09-21 20:33:51,787][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
[2024-09-21 20:33:52,173][ceph_volume.process][INFO  ] stderr purged osd.0
[2024-09-21 20:33:52,199][ceph_volume.process][INFO  ] Running command: /usr/bin/systemctl is-active ceph-osd@0
[2024-09-21 20:33:52,209][ceph_volume.process][INFO  ] stderr System has not been booted with systemd as init system (PID 1). Can't operate.
[2024-09-21 20:33:52,209][ceph_volume.process][INFO  ] stderr Failed to connect to bus: Host is down
[2024-09-21 20:33:52,214][ceph_volume.util.system][INFO  ] Executable lvs found on the host, will use /sbin/lvs
[2024-09-21 20:33:52,214][ceph_volume.process][INFO  ] Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S tags={ceph.osd_id=0} -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2024-09-21 20:33:52,289][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 80, in safe_prepare
    self.prepare()
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 118, in prepare
    prepare_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 53, in prepare_bluestore
    prepare_utils.osd_mkfs_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/util/prepare.py", line 459, in osd_mkfs_bluestore
    raise RuntimeError('Command failed with exit code %s: %s' % (returncode, ' '.join(command)))
RuntimeError: Command failed with exit code -11: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 8c3272ec-5983-4709-bcc7-69b83fa1bbc0 --setuser ceph --setgroup ceph

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/main.py", line 153, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.9/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.9/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 160, in main
    self.safe_prepare(self.args)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 84, in safe_prepare
    rollback_osd(self.args, self.osd_id)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/common.py", line 35, in rollback_osd
    Zap(['--destroy', '--osd-id', osd_id]).main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 407, in main
    self.zap_osd()
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 305, in zap_osd
    devices = find_associated_devices(self.args.osd_id, self.args.osd_fsid)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 88, in find_associated_devices
    raise RuntimeError('Unable to find any LV for zapping OSD: '
RuntimeError: Unable to find any LV for zapping OSD: 0
2024-09-21 20:33:52.371758 C | rookcmd: failed to configure devices: failed to initialize osd: failed to run ceph-volume raw command. Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 8c3272ec-5983-4709-bcc7-69b83fa1bbc0
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/chown -R ceph:ceph /dev/sdb
Running command: /usr/bin/ln -s /dev/sdb /var/lib/ceph/osd/ceph-0/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
 stderr: got monmap epoch 3
--> Creating keyring file for osd.0
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 8c3272ec-5983-4709-bcc7-69b83fa1bbc0 --setuser ceph --setgroup ceph
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
 stderr: purged osd.0
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 80, in safe_prepare
    self.prepare()
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 118, in prepare
    prepare_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 53, in prepare_bluestore
    prepare_utils.osd_mkfs_bluestore(
  File "/usr/lib/python3.9/site-packages/ceph_volume/util/prepare.py", line 459, in osd_mkfs_bluestore
    raise RuntimeError('Command failed with exit code %s: %s' % (returncode, ' '.join(command)))
RuntimeError: Command failed with exit code -11: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 8c3272ec-5983-4709-bcc7-69b83fa1bbc0 --setuser ceph --setgroup ceph

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 33, in <module>
    sys.exit(load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')())
  File "/usr/lib/python3.9/site-packages/ceph_volume/main.py", line 41, in __init__
    self.main(self.argv)
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/main.py", line 153, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.9/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.9/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 160, in main
    self.safe_prepare(self.args)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/raw/prepare.py", line 84, in safe_prepare
    rollback_osd(self.args, self.osd_id)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/common.py", line 35, in rollback_osd
    Zap(['--destroy', '--osd-id', osd_id]).main()
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 407, in main
    self.zap_osd()
  File "/usr/lib/python3.9/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 305, in zap_osd
    devices = find_associated_devices(self.args.osd_id, self.args.osd_fsid)
  File "/usr/lib/python3.9/site-packages/ceph_volume/devices/lvm/zap.py", line 88, in find_associated_devices
    raise RuntimeError('Unable to find any LV for zapping OSD: '
RuntimeError: Unable to find any LV for zapping OSD: 0: exit status 1
```
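
As far as I understand it, exit code -11 is Python's subprocess convention for a child process killed by signal 11 (SIGSEGV), so ceph-osd is segfaulting during --mkfs rather than returning a Ceph error. Two checks that might narrow this down on the affected node (a sketch; plain Linux tools, nothing Rook-specific):

```
# Look for the segfault record the kernel logs when ceph-osd dies
dmesg | grep -i ceph-osd

# Kernel page size: ceph-osd segfaults have been reported on ARM64 kernels
# that use a page size other than 4K
getconf PAGE_SIZE
```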

pomland-94 · Sep 22 '24 10:09