Operator v2: tracking features
We've recently started work on Operator v2.
This is intended as a list of features that need to be ported from v1, or features we want to add in v2:
- [ ] Cluster Setup:
  - [x] Create controller
  - [x] Create satellites
  - [x] Register satellites
  - [x] Sync node labels to satellites
  - [ ] Evict satellites
  - [ ] Add multiple NICs if Multus is used
  - [x] Satellite set up using new LinstorSatelliteConfig resource with node selectors
- [ ] DRBD:
  - [x] Detect loader image to use per node
  - [ ] Optionally use host network instead of pod network
  - [ ] Idea: rolling upgrades when node not in use?
- [ ] CSI:
  - [x] Driver setup
  - [ ] Restart when node labels change
  - [ ] Test with S3 backup
- [ ] Packaging & Release
  - [ ] Helm chart
  - [ ] Update instructions
  - [ ] Upgrade from v1 guide
    - [ ] Migration script for most common cases
    - [ ] DB migration from non-k8s backend #341
Note: this list is not complete. If there is something that should be added, please comment below.
@WanzenBug It would be a great chance to add a migration script to the new K8s backend and make it mandatory for operator v2 :-) OFC I can also create a separate issue if you prefer to track that independently.
Consider using Kustomize as the default deployment tool. It allows for greater control, and from a maintainability point of view it's simpler to patch resources than to template them.
A good example of where this is used is https://github.com/kubernetes-sigs/node-feature-discovery
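For illustration, a minimal sketch of what patching instead of templating could look like; the base path, the Deployment name, and the patch itself are placeholders, not actual names from the operator:

```yaml
# kustomization.yaml: hypothetical overlay that patches upstream manifests
# instead of templating them (all names and paths here are placeholders)
resources:
  - ../base                              # e.g. the unmodified upstream manifests
patches:
  - target:
      kind: Deployment
      name: operator-controller-manager  # placeholder name
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
```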
> Consider using Kustomize as the default deployment tool
On that front, I can report that the v2 branch is already using kustomize, both as a way to deploy the operator and as a way to customize the actually deployed resources. Basically, you can attach kustomize patches to the resources managed by the operator.
We are still thinking about adding some form of Helm chart, since a lot of users are still used to that.
Along with the registry, could we configure the image pull secrets, pull policy and image tag? That would make it simpler for end users to automate upgrading the application, for instance by using yq to adjust the image tag in a Makefile. More than happy to help contribute, as always.
```yaml
apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec:
  imageSource:
    repository: registry.example.com/piraeus
    tag: v1.10.0
    pullPolicy: IfNotPresent
    pullSecrets:
      - "SecretName"
```
Nice work on v2 so far!
There is a problem with the drbd-module-loader init container and the LVM mounts on Talos nodes.
On the Talos operating system, we use extensions to add things like kernel modules, and the directory structure is a little bit different.
I've deployed the operator using kustomize from the config/default directory and have created the following CRDs:
```yaml
apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstor-cluster
spec:
  nodeSelector:
    node-role.kubernetes.io/linstor: ""
---
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: all-satellites
spec:
  storagePools:
    - name: fs1
      filePool: {}
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: simple-fs
parameters:
  csi.storage.k8s.io/fstype: xfs
  # linstor.csi.linbit.com/autoPlace: "3" # not sure what this does = replica?
  linstor.csi.linbit.com/storagePool: fs1
provisioner: linstor.csi.linbit.com
volumeBindingMode: WaitForFirstConsumer
```
The drbd-module-loader init container of the "node pod" tries to hostPath-mount /usr/lib/modules, which does not exist on Talos, so the kubelet reports an error:

```
MountVolume.SetUp failed for volume "usr-lib-modules" : hostPath type check failed: /usr/lib/modules is not a directory
```
After manually removing the init container and building my own image of the operator, I found we also can't mount hostPath /etc/lvm/...:

```
spec: failed to generate spec: failed to mkdir "/etc/lvm/archive": mkdir /etc/lvm/archive: read-only file system
```
Only the drbd-reactor container can start.
Since I only want to use file-backed storage pools, I removed all LVM mounts from the linstor-satellite container and rebuilt my operator image.
Now the "node pod" starts successfully and PVs work!
```
15:10:27.686 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Controller connected and authenticated (10.0.4.220:51996)
15:10:27.894 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'oracle' created.
15:10:27.899 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'DfltDisklessStorPool' created.
15:10:27.970 [DeviceManager] INFO LINSTOR/Satellite - SYSTEM - Removing all res files from /var/lib/linstor.d
15:10:27.972 [DeviceManager] WARN LINSTOR/Satellite - SYSTEM - Not calling 'systemd-notify' as NOTIFY_SOCKET is null
15:10:30.600 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'fs1' created.
15:21:39.079 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-963ad95c-0812-46d2-9105-adf0d3812558' created for node 'oracle'.
15:21:39.624 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Primary Resource pvc-963ad95c-0812-46d2-9105-adf0d3812558
15:21:39.624 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Primary bool set on Resource pvc-963ad95c-0812-46d2-9105-adf0d3812558
15:21:39.689 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-963ad95c-0812-46d2-9105-adf0d3812558' updated for node 'oracle'.
15:21:39.851 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-963ad95c-0812-46d2-9105-adf0d3812558' updated for node 'oracle'.
15:21:41.011 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-963ad95c-0812-46d2-9105-adf0d3812558' updated for node 'oracle'.
```
```
$ talosctl -n 100.64.6.90 list /var/lib/linstor-pools
NODE          NAME
100.64.6.90   .
100.64.6.90   fs1
$ talosctl -n 100.64.6.90 list /var/lib/linstor-pools/fs1
NODE          NAME
100.64.6.90   .
100.64.6.90   pvc-963ad95c-0812-46d2-9105-adf0d3812558_00000.img
```
FYI: On Talos /usr/lib/ contains the following:
```
$ talosctl -n 100.64.6.90 list /usr/lib/
NODE          NAME
100.64.6.90   .
100.64.6.90   cryptsetup
100.64.6.90   engines-1.1
100.64.6.90   libaio.so
....many .so files
100.64.6.90   udev
100.64.6.90   xfsprogs
$ talosctl -n 100.64.6.90 list /lib/modules
NODE          NAME
100.64.6.90   .
100.64.6.90   5.15.86-talos
```
I can confirm I've installed the drbd and drbd_transport_tcp kernel modules from the Talos drbd extension:
```
$ talosctl -n 100.64.6.90 read /proc/drbd
version: 9.2.0 (api:2/proto:86-121)
GIT-hash: 71e60591f3d7ea05034bccef8ae362c17e6aa4d1 build by @buildkitsandbox, 2023-01-11 12:22:06
Transports (api:18): tcp (9.2.0)
```
Afterthoughts (I'm not an expert)
The /usr/lib/modules directory is typically where a Linux system stores the loadable kernel modules (drivers) that can be loaded into the kernel at runtime.
In contrast, Talos keeps its kernel modules under /lib/modules (see the listing above) and does not provide a /usr/lib/modules directory on disk(?)
Because the operator tries to mount the /usr/lib/modules directory, it will not work as expected on a Talos machine. We may need to modify this or remove the init container, but I think having some init container that runs the recommended `modprobe drbd usermode_helper=disabled` on every LINSTOR node would still be very helpful.
Refs: there are efforts to document how to use piraeus-operator on the Talos website (https://github.com/siderolabs/talos/pull/6426), which will help increase awareness of this great storage project.
@DJAlPee - Got this already working on Talos, but only the main branch piraeus v1 version. @cf-sewe @frezbo @smira - might be able to help support piraeus on this topic
Just a note: loading modules on Talos is disabled. Since Talos is a configuration-driven OS, module loading and its parameters are specified in the machine config, so I guess having an option to disable the init container makes more sense.
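For context, on Talos the DRBD modules and the recommended usermode_helper=disabled parameter would be declared in the machine config rather than loaded from an init container. A sketch along these lines; check the exact field layout against the Talos documentation for your version:

```yaml
# Sketch of a Talos machine config fragment that loads the DRBD modules
# declaratively; verify field names against the Talos docs for your version.
machine:
  kernel:
    modules:
      - name: drbd
        parameters:
          - usermode_helper=disabled
      - name: drbd_transport_tcp
```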
This sounds like the exact use-case we now have patches for:
```yaml
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: no-loader
spec:
  patches:
    - target:
        kind: Pod
        name: satellite
      patch: |
        apiVersion: v1
        kind: Pod
        metadata:
          name: satellite
        spec:
          initContainers:
            - name: drbd-module-loader
              $patch: delete
```
This disables the init container on all nodes.
This seems to be a pretty nice approach!
In v1 I used `operator.satelliteSet.kernelModuleInjectionMode=None` in Helm, which seems to have the same effect, but is deprecated (But why?).
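For reference, the same v1 setting expressed as a minimal Helm values snippet (a sketch based only on the key quoted above):

```yaml
# values.yaml sketch for the v1 Helm chart, based on the key mentioned above
operator:
  satelliteSet:
    kernelModuleInjectionMode: None
```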
> but is deprecated (But why?)
Because you almost always want to use DepsOnly. I don't know if this would work on Talos, but it should ensure that all the other "useful" modules are loaded if they are available: dm-thin, dm-crypt, etc.
> Because you almost always want to use DepsOnly. [...]
As @frezbo stated, module loading is disabled in Talos. So we have the "almost" case here 😉
I will update the documentation draft to use None when using the v1 operator. I hope you keep this functionality in v1 and only remove it in v2 😉
Operator v2 is released.