Per-node devicePathFilter is ignored if useAllNodes is true
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: When specifying both a catch-all value and per-node values for dedicated storage nodes, the dedicated storage nodes' configuration is completely ignored.
Expected behavior: Node-specific configuration values should be respected, even when including nodes that don't have explicit configuration.
How to reproduce it (minimal and precise): With a simple setup like the following, using known scratch disk paths on all nodes, as well as all SAS drives on dedicated storage nodes:
```yaml
# ...
  storage:
    useAllNodes: true
    useAllDevices: false
    devicePathFilter: "^/dev/disk/by-path/pci-0000:00:1f.2-ata-[23]"
    nodes:
      - name: cephdisk01.example.com
        devicePathFilter: "^/dev/disk/by-path/pci-.+-sas-.+"
# ...
```
The generated osd-prepare pods (`rook-ceph-osd-prepare-cephdisk01.example.com-q4lnz` etc.) configure their filter environment variable with the global filter instead of the node-specific values:
```yaml
# ...
    - name: ROOK_DATA_DEVICE_PATH_FILTER
      value: ^/dev/disk/by-path/pci-0000:00:1f.2-ata-[23]
# ...
```
When removing the `spec.storage.devicePathFilter` value - but keeping `spec.storage.useAllNodes=true` and `spec.storage.nodes.*` - it instead generates osd-prepare pods entirely without a `ROOK_DATA_DEVICE_PATH_FILTER` value, even for nodes that have a value specified.
Setting `spec.storage.useAllNodes=false` causes something closer to the expected behaviour: osd-prepare pods that use the node-specific values and actually discover the intended drives. But it completely breaks discovery of the scratch disks on the worker nodes.
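To make the expectation concrete, here is a hypothetical sketch (in Python, purely illustrative - not Rook's actual Go implementation; the function name and spec layout are my own) of the precedence I'd expect: a node-specific `devicePathFilter` wins for that node, and the cluster-wide filter only applies as a fallback for nodes picked up via `useAllNodes` without explicit configuration:

```python
# Hypothetical sketch of the expected filter resolution; the function
# and data layout are illustrative, not Rook's actual implementation.

def resolve_device_path_filter(storage_spec: dict, node_name: str):
    """Return the devicePathFilter an osd-prepare pod should use."""
    # A node-specific filter should win over the cluster-wide default...
    for node in storage_spec.get("nodes", []):
        if node.get("name") == node_name and "devicePathFilter" in node:
            return node["devicePathFilter"]
    # ...and the cluster-wide filter should only apply as a fallback,
    # e.g. for nodes included via useAllNodes with no explicit entry.
    return storage_spec.get("devicePathFilter")


spec = {
    "useAllNodes": True,
    "devicePathFilter": "^/dev/disk/by-path/pci-0000:00:1f.2-ata-[23]",
    "nodes": [
        {
            "name": "cephdisk01.example.com",
            "devicePathFilter": "^/dev/disk/by-path/pci-.+-sas-.+",
        }
    ],
}

# Dedicated storage node: the node-specific SAS filter should apply.
print(resolve_device_path_filter(spec, "cephdisk01.example.com"))
# Plain worker node: the cluster-wide scratch-disk filter should apply.
print(resolve_device_path_filter(spec, "worker01.example.com"))
```

Instead, as shown above, the cluster-wide filter is applied to every node.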
Environment:
- OS (e.g. from /etc/os-release): openSUSE Leap 15.6
- Kernel (e.g. `uname -a`): 6.4.0-150600.23.73-default
- Cloud provider or hardware configuration: 32x worker nodes, 6x storage nodes (which use external SAS enclosures)
- Rook version (use `rook version` inside of a Rook Pod): v1.18.6
- Storage backend version (e.g. for ceph do `ceph -v`): 19.2.3
- Kubernetes version (use `kubectl version`): v1.33.6
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Regular Kubernetes
- Storage backend status (e.g. for Ceph use `ceph health` in the Rook Ceph toolbox): HEALTH_OK
@ananace If individual nodes are specified under the `nodes` field, then `useAllNodes` must be set to `false`, as in the linked example. So this sounds like expected behavior.
@sp98 I have been looking through the documentation, and I've not found a single place where such a thing is stated. In fact, the example you've linked includes both a `nodes` list alongside a global `deviceFilter` (albeit an empty one), which seems to suggest that using both should be supported.
If this is supposed to be intended behavior, then the documentation needs to actually say so somewhere. But it really does feel more like an arbitrary limitation than a design choice.