DeepSea
Custom OSD configs break Stage 3
On the DeepSea master branch, if I use the OSD configuration from qa/osd-config/ovh/bs_dedicated_db.yaml, Stage 3 fails like this:
(steps 1-27 succeed)
2018-07-27T14:33:23.495 INFO:teuthology.orchestra.run.target217182143060.stdout:[28/29] Executing state ceph.processes.osd...
2018-07-27T14:33:23.495 INFO:teuthology.orchestra.run.target217182143060.stdout: |_ target217182143060.teuthology: wait for osd processes(cephprocesses.wait) fail
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout: target217182143060.teuthology... fail
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout:Finished stage ceph.stage.3: succeeded=27/29 failed=1/29
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout:
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout:Failures summary:
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout:ceph.processes.osd (/srv/salt/ceph/processes/osd):
2018-07-27T14:33:23.496 INFO:teuthology.orchestra.run.target217182143060.stdout: target217182143060.teuthology:
2018-07-27T14:33:23.497 INFO:teuthology.orchestra.run.target217182143060.stdout: wait for osd processes: Module function cephprocesses.wait executed
2018-07-27T14:33:23.588 INFO:teuthology.orchestra.run.target217182143060.stdout:deepsea exit status: 1
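For context, the failing state presumably just waits for the expected ceph-osd processes to show up on the minion and gives up after a timeout; since no OSDs actually get deployed (see the ceph-disk failure below), there is nothing for it to find. Conceptually it amounts to something like this (a minimal sketch, not DeepSea's actual cephprocesses.wait code; the process name, expected count, and timeout are assumptions):

```python
import subprocess
import time

def wait_for_osd_processes(expected=1, timeout=120, interval=5):
    """Poll until at least `expected` ceph-osd processes are running,
    or give up after `timeout` seconds. Conceptual sketch only."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        # pgrep exits non-zero when nothing matches, so don't use check=True
        result = subprocess.run(
            ["pgrep", "-c", "-x", "ceph-osd"],
            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
        )
        running = int(result.stdout or b"0")
        if running >= expected:
            return True
        time.sleep(interval)
    # This is the case Stage 3 ends up in here: zero ceph-osd processes.
    return False
```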
Note: this is a single-node cluster in OVH with a root disk (/dev/vda) plus four additional disk drives (/dev/vd[bcde]).
Relevant excerpt from pillar data:
2018-07-27T14:27:08.593 INFO:teuthology.orchestra.run.target217182143060.stdout: ceph:
2018-07-27T14:27:08.593 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.593 INFO:teuthology.orchestra.run.target217182143060.stdout: storage:
2018-07-27T14:27:08.593 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.593 INFO:teuthology.orchestra.run.target217182143060.stdout: osds:
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vdb:
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: db:
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vde
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: format:
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: bluestore
2018-07-27T14:27:08.594 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vdc:
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: db:
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vde
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: format:
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: bluestore
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vdd:
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: ----------
2018-07-27T14:27:08.595 INFO:teuthology.orchestra.run.target217182143060.stdout: db:
2018-07-27T14:27:08.596 INFO:teuthology.orchestra.run.target217182143060.stdout: /dev/vde
2018-07-27T14:27:08.596 INFO:teuthology.orchestra.run.target217182143060.stdout: format:
2018-07-27T14:27:08.596 INFO:teuthology.orchestra.run.target217182143060.stdout: bluestore
Logs: http://137.74.25.20/ubuntu-2018-07-27_14:11:42-suse:smoke:health-ok-wip-qa-osd-config---basic-openstack/91/
It appears to be reproducible. Here's the next run: http://137.74.25.20:8081/ubuntu-2018-07-27_14:21:18-suse:smoke:health-ok-wip-qa-osd-config---basic-openstack/
(the green one is using the default profile)
A relevant failure (from the minion logs):
2018-07-27 14:30:10,067 [salt.loaded.ext.module.helper:64 ][DEBUG ][41364] stderr of PYTHONWARNINGS=ignore ceph-disk -v prepare --bluestore --data-dev --journal-dev --cluster ceph --cluste
r-uuid 1efc1033-0565-4586-9f79-36906a01bbb0 --block.db /dev/vde /dev/vdd:
[-- snip -- ]
'command: Running command: /usr/sbin/parted --machine -- /dev/vde print\n'
'get_free_partition_index: get_free_partition_index: analyzing \r'
' \r'
'BYT;\n'
'/dev/vde:10.7GB:virtblk:512:512:unknown:Virtio Block Device:;\n'
'\n'
'Traceback (most recent call last):\n'
' File "/usr/sbin/ceph-disk", line 11, in <module>\n'
" load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()\n"
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 5753, in '
'run\n'
' main(sys.argv[1:])\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 5704, in '
'main\n'
' args.func(args)\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2121, in '
'main\n'
' Prepare.factory(args).prepare()\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2110, in '
'prepare\n'
' self._prepare()\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2191, in '
'_prepare\n'
' self.data.prepare(*to_prepare_list)\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2965, in '
'prepare\n'
' self.prepare_device(*to_prepare_list)\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 3142, in '
'prepare_device\n'
' to_prepare.prepare()\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2332, in '
'prepare\n'
' self.prepare_device()\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 2424, in '
'prepare_device\n'
' num=num)\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 1778, in '
'create_partition\n'
' num = get_free_partition_index(dev=self.path)\n'
' File "/usr/lib/python3.6/site-packages/ceph_disk/main.py", line 1535, in '
'get_free_partition_index\n'
' return max(partition_numbers) + 1\n'
'ValueError: max() arg is an empty sequence')
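The ValueError is the immediate cause: /dev/vde (the shared block.db device) has no partition table yet, so parted reports it as "unknown" and lists no partitions, and get_free_partition_index ends up calling max() on an empty list. Roughly, the logic around main.py line 1535 does something like this (a simplified reconstruction based on the parted output above, not the verbatim ceph-disk source):

```python
import subprocess

def get_free_partition_index(dev):
    """Return the next free partition number on `dev`.
    Simplified reconstruction; the real code lives in ceph_disk/main.py."""
    out = subprocess.check_output(
        ["parted", "--machine", "--", dev, "print"]).decode()
    # parted --machine output looks like:
    #   BYT;
    #   /dev/vde:10.7GB:virtblk:512:512:unknown:Virtio Block Device:;
    #   1:1049kB:...;        <- one line per existing partition
    partition_numbers = [
        int(line.split(":")[0])
        for line in out.splitlines()[2:]   # skip the unit and device lines
        if line and line[0].isdigit()
    ]
    # With a blank disk there are no partition lines, so partition_numbers
    # is empty and this raises "ValueError: max() arg is an empty sequence".
    return max(partition_numbers) + 1
```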
lsblk before Stage 3:
2018-07-27T15:30:14.593 INFO:teuthology.orchestra.run.target149202164067.stdout: NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
2018-07-27T15:30:14.594 INFO:teuthology.orchestra.run.target149202164067.stdout: vda 254:0 0 40G 0 disk
2018-07-27T15:30:14.594 INFO:teuthology.orchestra.run.target149202164067.stdout: |-vda1 254:1 0 2M 0 part
2018-07-27T15:30:14.594 INFO:teuthology.orchestra.run.target149202164067.stdout: |-vda2 254:2 0 20M 0 part /boot/efi
2018-07-27T15:30:14.594 INFO:teuthology.orchestra.run.target149202164067.stdout: `-vda3 254:3 0 40G 0 part /
2018-07-27T15:30:14.595 INFO:teuthology.orchestra.run.target149202164067.stdout: vdb 254:16 0 10G 0 disk
2018-07-27T15:30:14.595 INFO:teuthology.orchestra.run.target149202164067.stdout: vdc 254:32 0 10G 0 disk
2018-07-27T15:30:14.595 INFO:teuthology.orchestra.run.target149202164067.stdout: vdd 254:48 0 10G 0 disk
2018-07-27T15:30:14.595 INFO:teuthology.orchestra.run.target149202164067.stdout: vde 254:64 0 10G 0 disk
and lsblk, ceph-disk list, ceph osd tree after Stage 3:
2018-07-27T16:57:01.599 INFO:teuthology.orchestra.run.target149202171021.stdout:NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
2018-07-27T16:57:01.599 INFO:teuthology.orchestra.run.target149202171021.stdout:vda 254:0 0 40G 0 disk
2018-07-27T16:57:01.599 INFO:teuthology.orchestra.run.target149202171021.stdout:├─vda1 254:1 0 2M 0 part
2018-07-27T16:57:01.599 INFO:teuthology.orchestra.run.target149202171021.stdout:├─vda2 254:2 0 20M 0 part /boot/efi
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:└─vda3 254:3 0 40G 0 part /
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:vdb 254:16 0 10G 0 disk
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:└─vdb1 254:17 0 100M 0 part
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:vdc 254:32 0 10G 0 disk
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:└─vdc1 254:33 0 100M 0 part
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:vdd 254:48 0 10G 0 disk
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:└─vdd1 254:49 0 100M 0 part
2018-07-27T16:57:01.600 INFO:teuthology.orchestra.run.target149202171021.stdout:vde 254:64 0 10G 0 disk
2018-07-27T16:57:01.601 INFO:teuthology.orchestra.run.target149202171021.stderr:+ set -e
2018-07-27T16:57:01.601 INFO:teuthology.orchestra.run.target149202171021.stderr:+ lsblk
2018-07-27T16:57:01.601 INFO:teuthology.orchestra.run.target149202171021.stderr:+ ceph-disk list
2018-07-27T16:57:02.685 INFO:teuthology.orchestra.run.target149202171021.stdout:/dev/vda :
2018-07-27T16:57:02.685 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vda1 other, 21686148-6449-6e6f-744e-656564454649
2018-07-27T16:57:02.685 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vda2 other, vfat
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vda3 other, xfs
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout:/dev/vdb :
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vdb1 other, 89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout:/dev/vdc :
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vdc1 other, 89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout:/dev/vdd :
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout: /dev/vdd1 other, 89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be
2018-07-27T16:57:02.686 INFO:teuthology.orchestra.run.target149202171021.stdout:/dev/vde other, unknown
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr:/usr/lib/python3.6/site-packages/ceph_disk/main.py:5684: UserWarning:
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr:ceph-disk is deprecated and will be removed in the next major version
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr:of Ceph (Nautilus). The replacement tool is called ceph-volume. For
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr:details see http://docs.ceph.com/docs/master/ceph-volume/#migrating
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr:
2018-07-27T16:57:02.687 INFO:teuthology.orchestra.run.target149202171021.stderr: warnings.warn(DEPRECATION_WARNING)
2018-07-27T16:57:02.722 INFO:teuthology.orchestra.run.target149202171021.stderr:+ ceph osd tree
2018-07-27T16:57:03.084 INFO:teuthology.orchestra.run.target149202171021.stdout:ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
2018-07-27T16:57:03.084 INFO:teuthology.orchestra.run.target149202171021.stdout:-1 0 root default
Hrm, bs_dedicated_wal_crypt.yaml fails in the same way.
after Stage 1:
2018-07-27T18:18:46.672 INFO:teuthology.orchestra.run.target149202181003.stdout:Your randomly chosen storage profile bs_dedicated_wal_crypt.yaml has the following contents:
2018-07-27T18:18:46.672 INFO:teuthology.orchestra.run.target149202181003.stderr:+ echo 'Your randomly chosen storage profile bs_dedicated_wal_crypt.yaml has the following contents:'
2018-07-27T18:18:46.672 INFO:teuthology.orchestra.run.target149202181003.stderr:+ cat /srv/pillar/ceph/proposals/profile-random/stack/default/ceph/minions/target149202181003.teuthology.yml
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout:ceph:
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: storage:
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: osds:
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdb:
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: format: bluestore
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: encryption: dmcrypt
2018-07-27T18:18:46.673 INFO:teuthology.orchestra.run.target149202181003.stdout: wal: /dev/vde
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdc:
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: format: bluestore
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: encryption: dmcrypt
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: wal: /dev/vde
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdd:
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: format: bluestore
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: encryption: dmcrypt
2018-07-27T18:18:46.674 INFO:teuthology.orchestra.run.target149202181003.stdout: wal: /dev/vde
after Stage 3:
2018-07-27T18:27:52.638 INFO:teuthology.orchestra.run.target149202181003.stdout:NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:vda 254:0 0 40G 0 disk
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:├─vda1 254:1 0 2M 0 part
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:├─vda2 254:2 0 20M 0 part /boot/efi
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:└─vda3 254:3 0 40G 0 part /
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:vdb 254:16 0 10G 0 disk
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:└─vdb5 254:21 0 10M 0 part /var/lib/ceph/osd-lockbox/48acfb33-4116-43ad-8d96-0cb101c477ad
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:vdc 254:32 0 10G 0 disk
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:└─vdc5 254:37 0 10M 0 part /var/lib/ceph/osd-lockbox/28bedee2-e765-407e-a250-fa93563f776b
2018-07-27T18:27:52.639 INFO:teuthology.orchestra.run.target149202181003.stdout:vdd 254:48 0 10G 0 disk
2018-07-27T18:27:52.640 INFO:teuthology.orchestra.run.target149202181003.stdout:└─vdd5 254:53 0 10M 0 part /var/lib/ceph/osd-lockbox/9ec2c4a3-c06b-46ed-b3d0-0043fcb7c9be
2018-07-27T18:27:52.640 INFO:teuthology.orchestra.run.target149202181003.stdout:vde 254:64 0 10G 0 disk
2018-07-27T18:27:52.640 INFO:teuthology.orchestra.run.target149202181003.stderr:+ set -e
2018-07-27T18:27:52.640 INFO:teuthology.orchestra.run.target149202181003.stderr:+ lsblk
2018-07-27T18:27:52.640 INFO:teuthology.orchestra.run.target149202181003.stderr:+ ceph-disk list
2018-07-27T18:27:53.034 INFO:teuthology.orchestra.run.target149202181003.stdout:/dev/vda :
2018-07-27T18:27:53.035 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vda1 other, 21686148-6449-6e6f-744e-656564454649
2018-07-27T18:27:53.035 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vda2 other, vfat
2018-07-27T18:27:53.035 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vda3 other, xfs
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout:/dev/vdb :
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdb5 other, ext4
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout:/dev/vdc :
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdc5 other, ext4
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout:/dev/vdd :
2018-07-27T18:27:53.036 INFO:teuthology.orchestra.run.target149202181003.stdout: /dev/vdd5 other, ext4
2018-07-27T18:27:53.037 INFO:teuthology.orchestra.run.target149202181003.stdout:/dev/vde other, unknown
2018-07-27T18:27:53.037 INFO:teuthology.orchestra.run.target149202181003.stderr:/usr/lib/python3.6/site-packages/ceph_disk/main.py:5684: UserWarning:
2018-07-27T18:27:53.037 INFO:teuthology.orchestra.run.target149202181003.stderr:ceph-disk is deprecated and will be removed in the next major version
2018-07-27T18:27:53.037 INFO:teuthology.orchestra.run.target149202181003.stderr:of Ceph (Nautilus). The replacement tool is called ceph-volume. For
2018-07-27T18:27:53.037 INFO:teuthology.orchestra.run.target149202181003.stderr:details see http://docs.ceph.com/docs/master/ceph-volume/#migrating
2018-07-27T18:27:53.038 INFO:teuthology.orchestra.run.target149202181003.stderr:
2018-07-27T18:27:53.038 INFO:teuthology.orchestra.run.target149202181003.stderr: warnings.warn(DEPRECATION_WARNING)
2018-07-27T18:27:53.068 INFO:teuthology.orchestra.run.target149202181003.stderr:+ ceph osd tree
2018-07-27T18:27:53.422 INFO:teuthology.orchestra.run.target149202181003.stdout:ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
2018-07-27T18:27:53.422 INFO:teuthology.orchestra.run.target149202181003.stdout:-1 0 root default
Logs: http://137.74.25.20:8081/ubuntu-2018-07-27_18:05:26-suse:smoke:health-ok-wip-qa-osd-config---basic-openstack/102/
I have not tried all 20 profiles in qa/osd-config/ovh, but so far I've run around 5-10 of them at random and they all exhibit this same failure.
Now we need a way to confirm which configurations cause problems and work on a solution. What we actually need is a way to run all the configurations, one after another.
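A tiny driver would do, e.g. something that walks qa/osd-config/ovh/ and kicks off one run per profile. A rough sketch (the scheduling command is a hypothetical placeholder; substitute whatever we actually use to schedule a health-ok run with a given storage profile):

```python
import glob
import subprocess

# Hypothetical helper: replace with the real command used to schedule
# a suse:smoke:health-ok run against a given storage profile.
SCHEDULE_CMD = ["./schedule-health-ok-run.sh"]

for profile in sorted(glob.glob("qa/osd-config/ovh/*.yaml")):
    print("scheduling run with storage profile", profile)
    subprocess.run(SCHEDULE_CMD + [profile], check=True)
```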
@jschmid1 Can't run them sequentially, but I can run them in parallel :-)
That's even better @smithfarm :)