Storage: Fix PowerFlex apparmor warnings
While working on the SDC mode in PowerFlex, I was closely following the dmesg output throughout the entire lxd-ci tests/storage-vm test suite (when executed in NVMe/TCP mode) and found some apparmor warnings that, interestingly, do not cause any errors on the LXD side but do indicate some missing access.
They only appear from time to time and do not happen consistently. The IPs/ports listed in the error messages correspond to the NVMe/TCP connections to the PowerFlex SDTs:
This PR adds additional rules to both the rsync and qemu-img profiles to mitigate the following errors. At least one of the warnings was reported, unrelated to PowerFlex, in https://github.com/canonical/lxd/issues/13585.
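The description doesn't include the actual diff; as a rough illustration only (the profile name and the coarse rule granularity here are assumptions, not the PR's actual rules), allowing the confined helpers to open outbound TCP sockets could look something like:

```
# Hypothetical AppArmor fragment: permit the confined rsync/qemu-img
# tasks to use TCP sockets, since the kernel attributes the NVMe/TCP
# traffic to the calling task rather than to the block layer.
profile lxd_rsync flags=(attach_disconnected) {
  # ... existing file/exec rules ...
  network inet stream,   # IPv4 TCP (NVMe/TCP connections to the SDTs)
  network inet6 stream,  # IPv6 TCP
}
```

Note this grants all TCP connections, not just the SDT endpoints, which is part of the security concern raised later in the thread.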
Great observation, Julian!
> found some apparmor warnings that interestingly do not cause any errors on LXD side but indicate some missing access.
This is surprising, because these errors come from NVMe over TCP internals. It means the block device driver fails to perform some network requests, which should prevent any IO (except when the data is cached).
It is also a bit odd that an LSM controls and checks network access for a kernel block device. This is something we should look into from the kernel side, I believe.
@roosterfish I'm a bit confused why rsync is being blocked from sending data directly to an IP; normally we wrap rsync's network traffic inside a websocket, so the rsync process itself isn't sending to a remote IP.
In what situations are you getting these warnings?
> In what situations are you getting these warnings?
I have updated the PR's description to clarify this. The IPs/ports are from the target systems (PowerFlex SDTs) the host connects to via NVMe/TCP.
So is rsync directly communicating with PowerFlex, or is Linux interpreting a write to a locally mapped NVMe over TCP block device as a message being sent to the remote server?
> So is rsync directly communicating with powerflex or is Linux interpreting a write to a locally mapped nvme over TCP block device as a message being sent to the remote server?
It's presumably the latter as the PowerFlex driver doesn't have any specific logic when doing the rsync. It also uses the same generic functions we have in LXD for volume transfer.
static checks not happy
> It's presumably the latter as the PowerFlex driver doesn't have any specific logic when doing the rsync. It also uses the same generic functions we have in LXD for volume transfer.
Thanks. That feels like a layering violation to me: every program that accesses the device shouldn't need to be explicitly allowed to send packets to the mapped device's endpoint. After all, the programs themselves are not sending the packets, the underlying OS is. What do you think @mihalicyn ?
> That feels like a layering violation to me, as it shouldn't be needed for every program that accesses the device to need to be explicitly allowed to send packets to the mapped device's endpoint, after all the programs themselves are not sending the packets, but the underlying OS. What do you think @mihalicyn ?
This is what I've said above ;-)
This is something we should investigate from the kernel side.
> This is something we should investigate from the kernel side.
OK so we should not add it to the apparmor policy yet then?
> OK so we should not add it to the apparmor policy yet then?
I think we are left with no choice and have to add this. What I cannot understand is why nothing is failing on the LXD side.
If rsync/qemu-img are failing on the network requests produced by the NVMe block device, that should make the device faulty and cause EIO. But for some reason it works. Why? Page cache? I would try to put something like `echo 3 > /proc/sys/vm/drop_caches` just before each qemu-img/rsync call and retest whether it breaks things (it should!). If not, I guess we must dive in and understand how it keeps working with such critical network errors, and why.
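The suggested check can be sketched as a small shell wrapper (a sketch only; the function name and the plain `echo` messages are illustrative, not part of any LXD tooling):

```shell
# Drop the page cache (plus dentries/inodes) before a copy operation,
# so a failing NVMe/TCP request cannot be hidden by cached data and
# should surface as an EIO on the next read.
drop_caches() {
    # Value 3 = free page cache + dentries + inodes.
    # One-shot: caching resumes immediately afterwards.
    if [ -w /proc/sys/vm/drop_caches ]; then
        sync                               # flush dirty pages first
        echo 3 > /proc/sys/vm/drop_caches
        echo "caches dropped"
    else
        echo "need root to drop caches"
    fi
}

drop_caches
# ...the qemu-img convert / rsync invocation under test would follow here
```

Writing to `/proc/sys/vm/drop_caches` requires root and a writable procfs, hence the guard.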
> I would try to put something like `echo 3 > /proc/sys/vm/drop_caches` just before each `qemu-img`/`rsync` call and retest
I will test this, thanks for the suggestion. Which value do I have to echo into `/proc/sys/vm/drop_caches` afterwards to reset this?
> Which value do I have to echo into `/proc/sys/vm/drop_caches` afterwards to reset this?
Ah, good question: you don't need to echo anything afterwards. `echo 3 > /proc/sys/vm/drop_caches` is a one-shot operation. It drops all the caches once but does not disable caching.
I think we need to confirm whether anything is actually breaking before we proceed with this change. Otherwise we end up weakening the apparmor profile we use when calling rsync for "local" copies, allowing it to make network connections that should be unnecessary, and this applies to all storage drivers, not just PowerFlex.
If it's a kernel bug and/or not causing any actual problems, then we shouldn't need to work around it in LXD at the expense of reduced security.
@mihalicyn I have put this right in front of the qemu-img and rsync operations. The errors in the kernel's log do not look any different, and none of the errors get propagated to the caller of qemu-img/rsync:
@roosterfish shall we close this?
> @roosterfish shall we close this?
As it's clearly not causing any issues/errors on the LXD side, I will close it for now.