Metal: Support iSCSI initiator and NFS client for EKS-A Bare Metal persistent storage

Open Cajga opened this issue 3 years ago • 16 comments

What I'd like: We would like to use Bottlerocket for EKS-A Bare Metal in our DC. For persistent storage, we are going to use NetApp and their Astra Trident (formerly known as Trident) CSI driver. When a PVC is created in EKS-A, Astra Trident creates an iSCSI or NFSv4 volume on our NetApp cluster (and the respective PV in EKS-A) and binds it to the PVC. When a pod uses such a volume, the Astra Trident CSI pod (which comes from a daemonset) mounts the NFS export into the pod or, in the case of iSCSI, creates a filesystem on the iSCSI volume and mounts it into the pod.

In order to be able to do this, Astra Trident has some requirements for the Kubernetes worker node.

For NFS:

  • on Red Hat Linux variants it requires the nfs-utils package (on Ubuntu it requires the nfs-common)

For iSCSI:

  • on Red Hat variants it requires the following packages: lsscsi iscsi-initiator-utils sg3_utils device-mapper-multipath
  • sudo sed -i 's/^\(node.session.scan\).*/\1 = manual/' /etc/iscsi/iscsid.conf
  • sudo mpathconf --enable --with_multipathd y --find_multipaths n
  • sudo systemctl enable --now iscsid multipathd
  • sudo systemctl enable --now iscsi
  • please also note that the file which contains the per-host unique initiator ID must be persistent (on RHEL7 it is /etc/iscsi/initiatorname.iscsi, and it is generated by a post-install script when you install the iscsi-initiator-utils package)
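
As a minimal sketch of what the node.session.scan edit in the list above does, the snippet below applies the same substitution to a scratch file rather than to a real /etc/iscsi/iscsid.conf. The function name set_manual_scan and the sample file contents are illustrative assumptions, and `sed -i` is used in its GNU form.

```shell
#!/bin/sh
# Sketch: apply the node.session.scan edit to a scratch copy of a config file.
# set_manual_scan and the sample contents are illustrative, not from Trident.
set_manual_scan() {
    # $1: path to an iscsid.conf-style file.
    # \( \) is a capture group in sed's basic regex syntax; \1 re-emits the key.
    sed -i 's/^\(node.session.scan\).*/\1 = manual/' "$1"
}

conf="$(mktemp)"
printf '%s\n' \
    'node.startup = automatic' \
    'node.session.scan = auto' > "$conf"

set_manual_scan "$conf"

# Show the rewritten setting; other lines in the file are left untouched.
grep '^node.session.scan' "$conf"
```

Only the line that starts with node.session.scan is rewritten, so the edit can be repeated safely on a file that already has the manual setting.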

Any alternatives you've considered: There is no alternative. This issue blocks us from using Bottlerocket for EKS-A and forces us to build our own worker image (as only Bottlerocket is shipped with EKS-A BM). It would be nice to have a list of supported CSI drivers for Bottlerocket (ones that can be used with storage in a standard DC rather than in an AWS region).

Cajga avatar Nov 11 '22 15:11 Cajga

Hi @Cajga - thanks for the request!

We'll have to take a deeper look at this CSI driver and its requirements. It's interesting that the packages are required host side vs. in the CSI daemonset. Outside of the mentioned packages, we'll also need to figure out what other assumptions are made about the host that wouldn't be true for Bottlerocket, i.e. the existence of a shell, etc.

We don't currently include the nfs-utils or any of the iSCSI packages in Bottlerocket. Which are you targeting?

zmrow avatar Nov 11 '22 17:11 zmrow

Hi @zmrow,

Thanks for looking into this.

Well, we have use cases for both iSCSI and NFSv4 (we have been using NetApp + Trident in EKS on Outpost). We have iSCSI as the default storage class in our clusters (with xfs on top), which is used in about 95% of cases. In the few cases where we need ReadWriteMany (RWX) volumes, we use the nfsv4 storage class. We would need to check whether we could get rid of the RWX volumes and rely only on iSCSI.

For iSCSI, the host must have an initiator ID that does not change. As I mentioned above, on standard Linux this is normally generated when the iSCSI packages are installed. Trident looks for this ID on the host to register each node in the NetApp cluster. Without this information, the NetApp cluster would not give iSCSI volumes to the host. I wonder whether the daemonset would be the right place for such an ID. Or maybe the daemonset could generate it on first run and persist it somewhere in Bottlerocket. Is there a different CSI driver that manages iSCSI volumes and works with Bottlerocket? I could open a support case with NetApp to have them take a look at their implementation...

Cajga avatar Nov 11 '22 18:11 Cajga

Maybe relevant, and it could help a bit in planning (the ID is not always generated at install time): in RHEL9, the /etc/iscsi/initiatorname.iscsi file (which contains an ID like InitiatorName=iqn.1994-05.com.redhat:a5a8dd56f673) is created at the first start of iscsid.service, not at install time.
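
The "generate once, then persist" behaviour described here could look roughly like the sketch below. It is an assumption-laden illustration, not how Bottlerocket or Trident does it: ensure_initiator_name is a hypothetical helper, the fallback IQN prefix is a placeholder, and the demo writes to a temporary directory instead of /etc/iscsi. iscsi-iname is the generator shipped with open-iscsi and is only used if it happens to be on PATH.

```shell
#!/bin/sh
# Sketch: create an initiator name file once and never overwrite it afterwards.
# ensure_initiator_name and the fallback iqn prefix are hypothetical.
ensure_initiator_name() {
    file="$1"
    # An existing, non-empty file wins: the ID must stay stable per host.
    if [ -s "$file" ]; then
        return 0
    fi
    if command -v iscsi-iname >/dev/null 2>&1; then
        # iscsi-iname (from open-iscsi) prints a freshly generated IQN.
        iqn="$(iscsi-iname)"
    else
        # Fallback: placeholder prefix plus 12 random hex digits.
        suffix="$(od -An -N6 -tx1 /dev/urandom | tr -d ' \n')"
        iqn="iqn.2000-01.com.example:${suffix}"
    fi
    mkdir -p "$(dirname "$file")"
    printf 'InitiatorName=%s\n' "$iqn" > "$file"
}

# Demo in a scratch directory: the second call must not change the file.
demo_dir="$(mktemp -d)"
ensure_initiator_name "$demo_dir/iscsi/initiatorname.iscsi"
first="$(cat "$demo_dir/iscsi/initiatorname.iscsi")"
ensure_initiator_name "$demo_dir/iscsi/initiatorname.iscsi"
second="$(cat "$demo_dir/iscsi/initiatorname.iscsi")"
echo "$first"
```

Where such a file would have to live so that it survives reboots and updates on a given OS is left open here; the sketch only shows the idempotent generation step.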

Cajga avatar Nov 11 '22 18:11 Cajga

@Cajga do you know if there are other host requirements besides the nfs/iscsi packages mentioned?

zmrow avatar Nov 23 '22 17:11 zmrow

@zmrow well, let me collect what I am aware of:

  • for iSCSI, when the daemonset starts, it tries to read the initiator ID from the /etc/iscsi/initiatorname.iscsi file. For each node, this ID must be persistent and unique. As such, it should not be included at image build time but rather generated during install of a node (note: as I mentioned above, depending on the OS, this file may get generated at iscsi package install time or when one starts the iscsi daemon).
  • for iSCSI, it is using multipathd to configure multipath devices for each "path" to the NetApp cluster. Then it is optionally creating an FS on top of these (defaults to xfs) before it gets mounted into the container which has the volume configured. For this, one needs to modify and persist the configuration of multipathd as well (see in the ticket description).
  • for iSCSI, it requires that the OS runs the following daemons: iscsi, iscsid, multipathd
  • fortunately, it is open source, so I searched the source and found that Trident may call the following commands (the "how" can be found here): systemctl, tridentctl, cat, mkdir, mount, umount, ls, lsscsi, free, pgrep, multipathd, iscsiadm, df, mkfs.xfs, mkfs.ext3, mkfs.ext4, cryptsetup, blkid, multipath, losetup
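
The command list above can be turned into a quick pre-flight check. The sketch below is a hypothetical helper (missing_commands is not part of Trident) that prints every named command it cannot find on PATH. It only says something about the environment it runs in, so it would need to be run on the host itself, or in a context that sees the host's PATH, to be meaningful.

```shell
#!/bin/sh
# Sketch: print each given command name that cannot be found on PATH.
# missing_commands is a hypothetical helper, not a Trident tool.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Command names copied from the list above.
missing_commands systemctl tridentctl cat mkdir mount umount ls lsscsi free \
    pgrep multipathd iscsiadm df mkfs.xfs mkfs.ext3 mkfs.ext4 cryptsetup \
    blkid multipath losetup
```

An empty result means every listed command resolved; it does not prove that the daemons (iscsi, iscsid, multipathd) are running or configured.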

I am happy to help to test in case you decide to give it a go.

Cajga avatar Nov 24 '22 07:11 Cajga

It might be worth noting that the Trident CSI driver is also used to integrate AWS FSx Ontap with EKS clusters: https://docs.aws.amazon.com/eks/latest/userguide/fsx-ontap.html

Same requirements apply.

Solving this will therefore not just be relevant for EKS-A bare metal, but also for EKS (with bottlerocket).

It's interesting that the packages are required host side vs. in the CSI daemonset.

Storage access is always from the host. The CSI driver orchestrates it but does not sit in the datapath.

wonderland avatar Nov 24 '22 12:11 wonderland

Most CSI drivers that use iSCSI also rely on multipathd to be functional. The HPE CSI Driver for Kubernetes expects these commands to exist on the host (btrfs being optional):

blkid
blockdev
btrfs
dmidecode
dnsdomainname
find
fsck
ip
iscsiadm
lsblk
lsscsi
mkfs.btrfs
mkfs.ext3
mkfs.ext4
mkfs.xfs
mount
multipath
multipathd
resize2fs
sg_inq
umount
xfs_growfs

datamattsson avatar Mar 28 '23 17:03 datamattsson

The inability to use persistent storage is one of the major limitations (and the only blocker in our case with EKS-A BM) of using Bottlerocket on bare metal deployments.

Could you please update us on your plans regarding this issue?

Cajga avatar Apr 13 '23 18:04 Cajga

Hi @Cajga,

Supporting iSCSI has some follow-on effects which are not easy to support today. Effectively, it either has to be supported through a new variant or through adding additional software packages to the base OS image (which may only be required by a subset of users), both of which are not ideal and likely not sustainable in the long run. The larger problem here is meeting needs of users with specific requirements without causing follow-on effects. It's not an easy problem to solve.

Earlier this year, the team started working on an out-of-tree build system for Bottlerocket (https://github.com/bottlerocket-os/bottlerocket/issues/2669), which addresses the larger problem without causing others: you can build Bottlerocket and add in the specific supporting software/drivers you need without having to fully roll your own variant or maintain everything yourself. So the team is currently focusing on that effort instead of creating new variants or adding very specific supporting software to existing variants.

So there are currently no short-term plans to add iSCSI support directly, but the problem is likely to be resolved by the flexibility Bottlerocket will gain through out-of-tree builds.

etungsten avatar Apr 13 '23 22:04 etungsten

Hi @etungsten,

Thank you for the update, it seems to be a very interesting approach. Looking forward to testing it when it is ready.

I must say, I don't really see at the moment how an OOTB variant would be shipped and supported with EKS Anywhere, for example. For an EKS-A Bare Metal installation, a variant that contains the necessary packages and daemons to use iSCSI and NFS based persistent storage would make sense, but it may not be needed for other use cases...

Cajga avatar Apr 14 '23 18:04 Cajga

There seems to be an official variant called metal-k8s-VERSION.

From the Readme:

The following variants are designed to be Kubernetes worker nodes on bare metal

I wonder how this goal would be achieved without the possibility to use persistent storage. Or are you planning to deprecate the metal variant when the OOTB solution is ready?

Cajga avatar Apr 14 '23 18:04 Cajga

Can we expect iscsi support in Bottlerocket anytime soon?

d3bt3ch avatar Sep 13 '23 23:09 d3bt3ch

Can we expect iscsi support in Bottlerocket anytime soon?

It is still in the queue of things we would like to see, but so far no one has been able to work on it yet. Contributions welcome of course, but otherwise this will be in the backlog until someone can devote some time to it.

It is good to see these comments on the issue to help gauge interest. That may help when trying to decide how to prioritize some of these backlog items. So please do feel free to chime in if anyone else would like to see this supported!

stmcginnis avatar Sep 14 '23 00:09 stmcginnis

+1 - Adding comment on behalf of one of our customers. This limitation affects the customers needing PVs with RWX access mode in Bottlerocket based EKS-A clusters.

cparik avatar Sep 21 '23 03:09 cparik

We would like to see a resolution to this as a paying EKS-A customer. We have NetApp ONTAP storage we'd like to use with Astra Trident. We also want the advantages of using Bottlerocket.

jda258 avatar Sep 22 '23 13:09 jda258

It might be worth noting that the Trident CSI driver is also used to integrate AWS FSx Ontap with EKS clusters: docs.aws.amazon.com/eks/latest/userguide/fsx-ontap.html

Same requirements apply.

Solving this will therefore not just be relevant for EKS-A bare metal, but also for EKS (with bottlerocket).

It's interesting that the packages are required host side vs. in the CSI daemonset.

Storage access is always from the host. The CSI driver orchestrates it but does not sit in the datapath.

Since this seems to be the official tracking issue for iSCSI integration, could we please emphasize that this is not a bare metal-only issue, but is also blocking the use of FSx Ontap with EKS (probably others) by adjusting the title for example?

trc-ikeskin avatar Jun 24 '24 08:06 trc-ikeskin