container-storage-setup
docker storage limited to 2TB by sfdisk
I tried to use docker-storage-setup on a disk that was larger than 2TB:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 558.4G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 557.4G 0 part
├─rhel7-root 253:0 0 556.3G 0 lvm /
└─rhel7-swap 253:1 0 1G 0 lvm [SWAP]
sdb 8:16 0 2.2T 0 disk
root@bkr-hv02: ~ # systemctl start docker-storage-setup
Job for docker-storage-setup.service failed because the control process exited with error code. See "systemctl status docker-storage-setup.service" and "journalctl -xe" for details.
It failed to start and threw an error that comes from the sfdisk utility used to create partitions.
root@bkr-hv02: ~ # systemctl status docker-storage-setup -l
● docker-storage-setup.service - Docker Storage Setup
Loaded: loaded (/usr/lib/systemd/system/docker-storage-setup.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2016-04-06 11:36:58 EDT; 7s ago
Process: 61143 ExecStart=/usr/bin/docker-storage-setup (code=exited, status=1/FAILURE)
Main PID: 61143 (code=exited, status=1/FAILURE)
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: /dev/sdb4 0 - 0 0 Empty
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: Warning: partition 1 has size 2.4 TB (2398201315328 bytes),
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: which is larger than the 2199023255040 bytes limit imposed
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: by the DOS partition table for 512-byte sectors
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: sfdisk: I don't like these partitions - nothing changed.
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61143]: (If you really want this, use the --force option.)
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com systemd[1]: docker-storage-setup.service: main process exited, code=exited, status=1/FAILURE
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com systemd[1]: Failed to start Docker Storage Setup.
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com systemd[1]: Unit docker-storage-setup.service entered failed state.
Apr 06 11:36:58 bkr-hv02.lab.eng.rdu.redhat.com systemd[1]: docker-storage-setup.service failed.
As suggested in the error, I added --force to /usr/bin/docker-storage-setup:
# diff -pruN docker-storage-setup /usr/bin/docker-storage-setup
--- docker-storage-setup 2016-04-06 11:41:44.519253366 -0400
+++ /usr/bin/docker-storage-setup 2016-04-06 11:41:49.431230315 -0400
@@ -568,7 +568,7 @@ create_disk_partitions() {
# * Error handling when partition(s) already exist
# * Deal with loop/nbd device names. See growpart code
size=$(( $( awk "\$4 ~ /"$( basename $dev )"/ { print \$3 }" /proc/partitions ) * 2 - 2048 ))
- cat <<EOF | sfdisk $dev
+ cat <<EOF | sfdisk --force $dev
unit: sectors
${dev}1 : start= 2048, size= ${size}, Id=8e
And then it was able to create the partition:
# systemctl status docker-storage-setup -l
● docker-storage-setup.service - Docker Storage Setup
Loaded: loaded (/usr/lib/systemd/system/docker-storage-setup.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Apr 06 11:42:07 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Volume group "docker_vg" successfully created
Apr 06 11:42:07 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Rounding up size to full physical extent 192.00 MiB
Apr 06 11:42:07 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Wiping xfs signature on /dev/docker_vg/docker-poolmeta.
Apr 06 11:42:07 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Logical volume "docker-poolmeta" created.
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Logical volume "docker-pool" created.
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: WARNING: Converting logical volume docker_vg/docker-pool and docker_vg/docker-poolmeta to pool's data and metadata volumes.
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Converted docker_vg/docker-pool to thin pool.
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[61555]: Logical volume "docker-pool" changed.
Apr 06 11:42:08 bkr-hv02.lab.eng.rdu.redhat.com systemd[1]: Started Docker Storage Setup.
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 558.4G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 557.4G 0 part
├─rhel7-root 253:0 0 556.3G 0 lvm /
└─rhel7-swap 253:1 0 1G 0 lvm [SWAP]
sdb 8:16 0 2.2T 0 disk
└─sdb1 8:17 0 185.5G 0 part
├─docker_vg-docker--pool_tmeta 253:2 0 192M 0 lvm
│ └─docker_vg-docker--pool 253:4 0 74.1G 0 lvm
└─docker_vg-docker--pool_tdata 253:3 0 74.1G 0 lvm
└─docker_vg-docker--pool 253:4 0 74.1G 0 lvm
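A possible explanation for sdb1 showing up as 185.5G rather than about 2.2T: the MBR partition entry stores the sector count in a 32-bit field, so a forced write may keep only the low 32 bits. The sketch below just redoes that arithmetic with the byte count from the sfdisk warning; the helper names `wrapped_sectors` and `to_gib` are made up for illustration, and the wrap-around itself is an assumption, not something confirmed from the sfdisk source.

```shell
# Sector count left after keeping only the low 32 bits (assumed behaviour).
# $1: partition size in bytes, $2: sector size in bytes
wrapped_sectors() {
    echo $(( ($1 / $2) & 4294967295 ))
}

# Format a byte count as GiB with one decimal place, integer arithmetic only.
# $1: size in bytes
to_gib() {
    tenths=$(( ($1 * 10 + (1 << 29)) / (1 << 30) ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))G"
}

# 2398201315328 is the partition size from the sfdisk warning above.
sectors=$(wrapped_sectors 2398201315328 512)
echo "sectors after wrap: $sectors"
to_gib $(( sectors * 512 ))
```

With these inputs the wrapped size comes out at 185.5G, which matches the lsblk output, so the partition created with --force is most likely a truncated one rather than one covering the whole disk.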
If we defaulted to --force, would this cause issues? I.e., would errors on other badly configured systems be ignored?
I have no idea what issues --force would cause. I am wondering whether we should switch to "parted" instead of sfdisk, or use a different type of partition table, or something else that allows partitions bigger than 2TB.
So this seems to come from MBR, where the maximum partition size is 2TB (for 512-byte sectors). Should we consider using GPT? I am not sure whether using GPT would cause any issues.
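For reference, the 2199023255040-byte limit in the sfdisk warning is (2^32 - 1) sectors of 512 bytes. A minimal sketch of a pre-check the script could run before choosing a partition table; `mbr_max_bytes` and `fits_in_mbr` are hypothetical helpers, not part of docker-storage-setup.

```shell
# Largest byte size an MBR partition entry can describe: the sector count
# is a 32-bit field, so the ceiling is (2^32 - 1) sectors.
# $1: logical sector size in bytes
mbr_max_bytes() {
    echo $(( 4294967295 * $1 ))
}

# Succeeds (exit status 0) when a partition fits in an MBR entry.
# $1: partition size in bytes, $2: logical sector size in bytes
fits_in_mbr() {
    [ "$1" -le "$(mbr_max_bytes "$2")" ]
}

mbr_max_bytes 512   # prints 2199023255040, the limit from the warning

if fits_in_mbr 2398201315328 512; then
    echo "MBR is fine"
else
    echo "too large for MBR, needs GPT or a smaller partition"
fi
```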
--force seems to have me hitting this:
Apr 06 15:17:22 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[3708]: INFO: Waiting for device /dev/mapper/docker_vg-docker--pool to be available. Wait time remaining is 60 seconds
Apr 06 15:17:27 bkr-hv02.lab.eng.rdu.redhat.com docker-storage-setup[3708]: INFO: Waiting for device /dev/mapper/docker_vg-docker--pool to be available. Wait time remaining is 55 seconds
The partition, pv, and vg look OK, but the lv never shows up and so it sits there waiting until it times out (60 seconds).
I can't think of a reason not to use GPT.
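As a rough sketch of what the GPT route could look like with parted: the function below only prints the commands rather than running them, since `mklabel` wipes the existing partition table. `gpt_partition_cmds` is a hypothetical name, and the 1MiB start offset is an assumption chosen to mirror the 2048-sector start used in the sfdisk input.

```shell
# Print (do not run) the parted commands for one GPT partition spanning
# the disk and flagged for LVM.
# $1: whole-disk device, e.g. /dev/sdb
gpt_partition_cmds() {
    echo "parted -s $1 mklabel gpt"
    echo "parted -s $1 mkpart primary 1MiB 100%"
    echo "parted -s $1 set 1 lvm on"
}

gpt_partition_cmds /dev/sdb
```

The printed lines can be reviewed and then run by hand as root on a disk whose contents are disposable.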
Related to that, one thing that might be interesting is to allocate a GUID in https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/ - basically "if the GUID is X, automatically format it as a PV and add it to the VG backing /". That might make it slightly easier for admins to pre-provision disks.
@cgwalters Why do we need to partition the disk at all? Why can't we add it directly to volume group.
As for why we use partitions at all...I suspect it was done for the cloud case where we have one disk that gets magically expanded.
I also can't think of a reason not to skip partitions for raw disks.
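A sketch of what skipping the partition step might look like for a raw disk: the whole device becomes the PV. The function only prints the LVM commands; `raw_disk_cmds` and its third argument are hypothetical, and a real script would have to detect whether the volume group already exists instead of being told.

```shell
# Print (do not run) the LVM commands that put a whole disk into a VG.
# $1: whole-disk device, $2: volume group name,
# $3: "yes" if the volume group already exists
raw_disk_cmds() {
    echo "pvcreate $1"
    if [ "$3" = "yes" ]; then
        echo "vgextend $2 $1"
    else
        echo "vgcreate $2 $1"
    fi
}

raw_disk_cmds /dev/sdb docker_vg no
```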
Actually, even with raw disks, we might add the disk to the root volume group (if VG= was not specified), and on the next reboot we might have to grow that partition using growpart.
Right now we assume that every PV in the root volume group is partitioned, and I guess that's the reason we partition disks before adding them to a volume group.
"raw" but non-virtual disks will never grow right? I am not sure we need to support a scenario where secondary virtual disks are magically grown. A virt user can just as easily add a new disk.
It is however critically important to support adding partitions inside the root (first) disk for the IaaS case on first boot.