Add lxc config device add disk --type=block for containers

Open kamzar1 opened this issue 2 years ago • 2 comments

While a custom storage volume created with --type=block can be added to a VM as a block device, it cannot be attached to a container. Attaching a custom storage volume to a container as a unix-block device is also not possible, because there is no source path to point it at.

This feature would make custom volume handling uniform and consistent across instance types and storage volume types.

kamzar1 avatar Mar 17 '22 17:03 kamzar1

This should be easy enough to sort out. This would effectively behave like a unix-block device and so would require path to be set so we know what path to put the block device on.
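
Presumably the eventual syntax would look like a regular disk device entry, with path telling LXD where to create the node inside the container (a sketch of the anticipated behaviour; c1, default and foo are hypothetical names):

# Hypothetical until implemented: attach the block custom volume "foo" to container c1
# and expose it at /dev/sdb inside the container
lxc config device add c1 foo disk pool=default source=foo path=/dev/sdb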

stgraber avatar Mar 17 '22 18:03 stgraber

Hi, I second this feature because some solutions require or prefer a different file system, so we need to be able to attach a block device to the container. It would be useful to have it work like this:

# This also handles the permissions bit if possible
# Folks might need to remember to install relevant file system support packages on hosts and in containers
lxc storage volume create {pool} {volume} --type block --format ext4 size=30GiB 
lxc storage volume attach {pool} {volume} {container} {devicename} {path} 

or, at first, just the ability to add a block volume created via the lxc command using the instance config, which then opens it up to things like Ansible with the lxd_container module.

For me this is not a big problem because there is a good work-around and little need for block devices in containers.

My query, which links to the current working method thanks to other topics in the forum: https://discuss.linuxcontainers.org/t/few-questions-about-zvols-on-lxd-manged-zfs-pools/14606
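
For anyone landing here, the workaround from those threads is roughly as follows (a sketch; the pool, volume and device names are made up):

# Create and format a zvol on the host, then pass its device node in as unix-block
zfs create -V 30G mypool/c1_disk00
mkfs.ext4 /dev/zvol/mypool/c1_disk00
lxc config device add c1 disk00 unix-block source=/dev/zvol/mypool/c1_disk00 path=/dev/sdb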

Thanks

markrattray avatar Jul 15 '22 10:07 markrattray

Thanks!

tomponline avatar Feb 24 '23 13:02 tomponline

Removing this restriction is what this issue is about. The restriction should be lifted and the disk attached to the container as a unix block device in /dev/.

stgraber avatar Apr 14 '23 05:04 stgraber

With the fix for this issue applied, I should be able to do:

  • lxc storage volume create default foo --type block size=5GiB
  • lxc launch images:ubuntu/22.04 c1
  • lxc config device add c1 foo disk pool=default source=foo path=/dev/sda

And get a /dev/sda device in the container which is that custom block volume called foo.
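
A quick sanity check once it is attached (assuming the commands above) could be:

lxc exec c1 -- stat -c '%F' /dev/sda   # should report: block special file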

stgraber avatar Apr 14 '23 05:04 stgraber

Definitely looking forward to this

allanrogerr avatar May 17 '23 01:05 allanrogerr

Out of interest what is your use case for this?

tomponline avatar May 17 '23 02:05 tomponline

Good morning @tomponline

Not everything likes the underlying file system of filesystem volumes, or some things perform better on another file system, so in some circumstances we need to be able to choose, and a block volume does this for containers.

Docker is a good example with ZFS-backed pools, where I don't want to turn on block mode at the pool level. ArangoDB doesn't like Btrfs (not that I use Btrfs). A rare occurrence, but they are out there.

Although there is a work-around, as linked in previous comments, it complicates automation, e.g. Ansible. On my side it's not urgent at all, because the work-around works and I have no immediate need, with Docker Swarm unfortunately having to reside in VMs.

Thanks Mark

markrattray avatar May 17 '23 08:05 markrattray

By the way, for ZFS you can now use per-volume block mode, so you can attach application-specific custom volumes that are backed by ZFS but use a different filesystem. This can help with Docker, so it can use the overlay2 storage driver.

lxc storage volume create myzfs vol1 size=1GiB zfs.block_mode=true
lxc config device add c1 mydisk disk pool=myzfs source=vol1 path=/mnt
lxc exec c1 -- findmnt /mnt
TARGET SOURCE                              FSTYPE OPTIONS
/mnt   /dev/zvol/myzfs/custom/default_vol1 ext4   rw,relatime,discard,stripe=2
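
To double-check how the volume ended up configured (a sketch; exact keys and output may vary by LXD version):

lxc storage volume show myzfs vol1   # look for zfs.block_mode and block.filesystem under config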

tomponline avatar May 17 '23 08:05 tomponline

The reason I ask is that attaching block disks to unprivileged containers is still going to be fairly restricted, as unprivileged containers don't allow mounting. So if the application is using the block device directly then that is fine, but if the requirement is to attach a block disk and then mount it inside the container, then you may be disappointed.
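
As an illustration (a sketch, assuming a block volume attached at /dev/sda as in the example above): reading the device node directly works, but mounting it is refused:

lxc exec c1 -- dd if=/dev/sda of=/dev/null bs=1M count=1   # direct access works
lxc exec c1 -- mount /dev/sda /mnt                         # fails in an unprivileged container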

tomponline avatar May 17 '23 08:05 tomponline

Re: ZFS block mode. Thanks for that info, I'm a bit behind. In your example, how did it know to format it as ext4? I've done a quick search in the documentation for this... I can only see some volume.block.filesystem in the Specifications doc. So would we have to issue a command like this to get ext4?

lxc storage volume create zfs vol1 size=1GiB zfs.block_mode=true volume.block.filesystem=ext4

or would that now be

... zfs.block.filesystem=ext4

Re: attaching block disks to unprivileged containers. So I was bind-mounting the attached block disk, formatted with ext4, via fstab, for a few reasons:

  • get Docker on a supported file system when the underlying storage pools are ZFS
  • put all data to be backed up on a separate storage volume for snapshots of specific data
  • being able to plug the volume/copy/snapshot into another instance if needed
  • we traditionally have OS disks on cheaper SAS HDD RAID10, and data (clustered) on faster, more expensive, enterprise-grade NVMe
# docker disk
#   - /mnt-acme/c1_disk00: where the block device is mounted
#   - c1_disk00: underlying block volume name : {container-name}_disk{instance}
# 
/mnt-acme/c1_disk00/var/lib/docker /var/lib/docker none auto,bind 0 0
/mnt-acme/c1_disk00/etc/docker /etc/docker none auto,bind 0 0
/mnt-acme/c1_disk00/some/other/important/data /some/other/important/data none auto,bind 0 0
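
For reference, the one-shot equivalent of those fstab entries (same hypothetical paths) is a plain bind mount, e.g.:

mount --bind /mnt-acme/c1_disk00/var/lib/docker /var/lib/docker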

markrattray avatar May 17 '23 09:05 markrattray

or would that now be

... zfs.block.filesystem=ext4

Yes, that's correct. It defaults to ext4 if not specified.

tomponline avatar May 17 '23 09:05 tomponline

Out of interest what is your use case for this?

Quickly attaching multiple drives to multiple unprivileged LXC instances/containers in order to more accurately simulate different scenarios under which MinIO multi-node, multi-drive systems could be configured, without having to physically create the systems first. When you have something, let us know - I will be one of the first to test!

allanrogerr avatar May 17 '23 12:05 allanrogerr

The reason I ask is that attaching block disks to unprivileged containers is still going to be fairly restricted, as unprivileged containers don't allow mounting. So if the application is using the block device directly then that is fine, but if the requirement is to attach a block disk and then mount it inside the container, then you may be disappointed.

This is fine - once this is implemented, we should be able to allow mounting for unprivileged containers with the following, if I'm not mistaken:

lxc config set c1 security.syscalls.intercept.mount.allowed=ext4 security.syscalls.intercept.mount=true
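
With both keys set, and assuming the /dev/sda device from the earlier example, an in-container mount like this should then be permitted:

lxc exec c1 -- mount /dev/sda /mnt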

allanrogerr avatar May 17 '23 13:05 allanrogerr

Yes, if you're happy with the risks involved (a corrupted or maliciously constructed superblock could crash the host kernel).

tomponline avatar May 17 '23 13:05 tomponline

lxc storage volume create zfs vol1 size=1GiB zfs.block_mode=true volume.block.filesystem=ext4

@markrattray Are you sure? I was only able to get this to work with the following:

lxc storage volume create zfs vol1 size=1GiB zfs.block_mode=true block.filesystem=ext4

Do you know if there is a glossary of configuration options out there? Otherwise I would just infer it from https://github.com/lxc/lxd/blob/master/lxd/storage/drivers/driver_zfs_volumes.go#L1463
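
For reading a single key back, something like this should also work (assuming the storage volume get subcommand behaves like instance config get):

lxc storage volume get zfs vol1 block.filesystem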

allanrogerr avatar May 17 '23 13:05 allanrogerr

Ah yes, sorry, it's block.filesystem.

See the tip at the bottom of https://linuxcontainers.org/lxd/docs/master/reference/storage_zfs/#storage-pool-configuration that links to https://linuxcontainers.org/lxd/docs/master/howto/storage_volumes/#storage-configure-vol-default

tomponline avatar May 17 '23 13:05 tomponline

block.filesystem and block.mount_options are missing from https://linuxcontainers.org/lxd/docs/master/reference/storage_zfs/#storage-volume-configuration. @ru-fu / @monstermunchkin, are you able to add them based on the entries from https://linuxcontainers.org/lxd/docs/master/reference/storage_lvm/#storage-volume-configuration ?

tomponline avatar May 17 '23 13:05 tomponline

@markrattray Are you sure? I was only able to get this to work with the following:

lxc storage volume create zfs vol1 size=1GiB zfs.block_mode=true block.filesystem=ext4

Hi, sorry, it's theory... I've not tried the new method using zfs.block_mode. I have been able to get block vols mounted in unprivileged containers without it, and via Ansible as well: https://discuss.linuxcontainers.org/t/mounting-zvol-securely-in-container/4422/2 https://discuss.linuxcontainers.org/t/few-questions-about-zvols-on-lxd-manged-zfs-pools/14606

markrattray avatar May 17 '23 14:05 markrattray

Use case idea 1: would this also make it viable, or at least possible, to make block devices / unformatted filesystems available in containers, so that it is possible to run a Ceph cluster in containers using parts of disks, with ZFS volumes for BlueStore? I understand that it is not possible to make pure GPT partitions available in containers, but with ZFS and zvols, and ZFS now available in containers, it becomes possible to at least make block devices available with ZFS as a layer?

Use case idea 2: based on idea 1, would it also become possible to use Rook with Ceph on Kubernetes nodes in LXD containers? (This is possible now with VMs, but not with containers yet, until this change makes it possible?)

casperan avatar Aug 25 '23 10:08 casperan