
Storage: Fix PowerFlex apparmor warnings

Open roosterfish opened this issue 1 year ago • 14 comments

During the work on the SDC mode in PowerFlex I closely followed the dmesg output throughout the entire lxd-ci tests/storage-vm test suite (when executed for the NVMe/TCP mode) and found some apparmor warnings that, interestingly, do not cause any errors on the LXD side but indicate some missing access.

They appear only intermittently, not consistently. The IPs/ports listed in the error messages correspond to the NVMe/TCP connections to the PowerFlex SDTs:

[Screenshot from 2024-06-05 09-04-27]

This PR adds additional rules to both the rsync and qemuimg profiles to mitigate the following errors. At least one of the warnings was reported independently of PowerFlex in https://github.com/canonical/lxd/issues/13585.

[Screenshot from 2024-06-04 12-12-47]

[Image: nvme_apparmor]
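For reference, mitigations of this kind are usually expressed as AppArmor network permissions in the affected profiles. The fragment below is an illustrative sketch only, not the PR's actual diff; the profile name and comment are placeholders:

```
# Illustrative sketch, not the exact rules from this PR: allow the confined
# process to pass the LSM network checks that the kernel applies when its
# writes to the mapped NVMe/TCP block device generate socket traffic.
profile rsync {
  # ... existing file and mount rules ...
  network inet stream,
  network inet6 stream,
}
```

The same two `network` lines would go into the qemuimg profile, since both tools trigger the denials when touching the mapped device.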

roosterfish avatar Jun 04 '24 11:06 roosterfish

Great observation, Julian!

found some apparmor warnings that, interestingly, do not cause any errors on the LXD side but indicate some missing access.

This is surprising, because these errors come from NVMe over TCP internals. It means the block device driver fails to perform some network requests, which should prevent any IO (except when the data is cached).

It is also a bit odd that an LSM controls and checks network access for a kernel block device. This is something we should look into from the kernel side, I believe.

mihalicyn avatar Jun 04 '24 16:06 mihalicyn

@roosterfish I'm a bit confused why rsync is being blocked from sending data directly to an IP. Normally we wrap rsync over the network inside a websocket, so the rsync process itself isn't sending to a remote IP.

In what situations are you getting these warnings?

tomponline avatar Jun 05 '24 06:06 tomponline

In what situations are you getting these warnings?

I have updated the PR's description to clarify this. The IPs/ports are from the target systems (PowerFlex SDTs) the host connects to via NVMe/TCP.

roosterfish avatar Jun 05 '24 07:06 roosterfish

In what situations are you getting these warnings?

I have updated the PR's description to clarify this. The IPs/ports are from the target systems (PowerFlex SDTs) the host connects to via NVMe/TCP.

So is rsync directly communicating with PowerFlex, or is Linux interpreting a write to a locally mapped NVMe over TCP block device as a message being sent to the remote server?

tomponline avatar Jun 05 '24 07:06 tomponline

So is rsync directly communicating with PowerFlex, or is Linux interpreting a write to a locally mapped NVMe over TCP block device as a message being sent to the remote server?

It's presumably the latter as the PowerFlex driver doesn't have any specific logic when doing the rsync. It also uses the same generic functions we have in LXD for volume transfer.

roosterfish avatar Jun 05 '24 07:06 roosterfish

static checks not happy

tomponline avatar Jun 05 '24 07:06 tomponline

It's presumably the latter as the PowerFlex driver doesn't have any specific logic when doing the rsync. It also uses the same generic functions we have in LXD for volume transfer.

Thanks. That feels like a layering violation to me: it shouldn't be necessary for every program that accesses the device to be explicitly allowed to send packets to the mapped device's endpoint. After all, the programs themselves are not sending the packets; the underlying OS is. What do you think @mihalicyn ?

tomponline avatar Jun 05 '24 07:06 tomponline

That feels like a layering violation to me: it shouldn't be necessary for every program that accesses the device to be explicitly allowed to send packets to the mapped device's endpoint. After all, the programs themselves are not sending the packets; the underlying OS is. What do you think @mihalicyn ?

This is what I've said above ;-)

This is something we should investigate from the kernel side.

mihalicyn avatar Jun 05 '24 11:06 mihalicyn

This is something we should investigate from the kernel side.

OK so we should not add it to the apparmor policy yet then?

tomponline avatar Jun 05 '24 11:06 tomponline

OK so we should not add it to the apparmor policy yet then?

I think we are left with no choice and have to add this. What I cannot understand is why nothing is failing on the LXD side. If rsync/qemu-img are failing on the network requests produced by the NVMe block device, that should mark the device faulty and cause EIO. But for some reason it works. Why? Page cache? I would try putting something like echo 3 > /proc/sys/vm/drop_caches just before each qemu-img/rsync call and retesting whether it breaks things (it should!). If not, I guess we must dive into it and understand how it keeps working despite such critical network errors.
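A minimal sketch of that retest, assuming a root shell; the commented qemu-img invocation and its device paths are illustrative placeholders, not the calls LXD actually makes:

```shell
# Drop clean caches right before the copy so qemu-img/rsync must read
# from the NVMe/TCP device rather than from the page cache.
sync  # flush dirty pages first so they become evictable
if [ "$(id -u)" -eq 0 ]; then
    # 1 = page cache, 2 = dentries+inodes, 3 = both.
    # This is one-shot: it evicts caches once, it does not disable caching.
    echo 3 > /proc/sys/vm/drop_caches
fi
# Then run the copy, e.g. (paths are illustrative):
# qemu-img convert -f raw -O raw /dev/sdX /dev/sdY
```

If IO really bypasses the cache after this, a broken NVMe/TCP connection should surface as EIO in the qemu-img/rsync exit status.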

mihalicyn avatar Jun 06 '24 12:06 mihalicyn

I would try to put something like echo 3 > /proc/sys/vm/drop_caches just before each qemu-img/rsync calls and retest

I will test this, thanks for the suggestion. Which value do I have to echo into /proc/sys/vm/drop_caches afterwards to reset this?

roosterfish avatar Jun 06 '24 13:06 roosterfish

Which value do I have to echo into /proc/sys/vm/drop_caches afterwards to reset this?

Ah, good question: you don't actually need to echo anything afterwards. echo 3 > /proc/sys/vm/drop_caches is a one-shot operation. It just drops all the caches once but does not disable caching.

mihalicyn avatar Jun 06 '24 13:06 mihalicyn

I think we need to confirm whether anything is actually breaking before we proceed with this change. Otherwise we end up weakening the apparmor profile we use when calling rsync for "local" copies, allowing it to make network connections that should be unnecessary, and that would apply to all storage drivers, not just PowerFlex.

If it's a kernel bug and/or not causing any actual problems, then we shouldn't need to work around it in LXD at the expense of reduced security.

tomponline avatar Jun 07 '24 07:06 tomponline

@mihalicyn I have put this right in front of the qemu-img and rsync operations. The errors in the kernel log look no different, and none of them get propagated to the caller of qemu-img/rsync: [Screenshot from 2024-06-25 17-46-53]

roosterfish avatar Jun 25 '24 15:06 roosterfish

@roosterfish shall we close this?

tomponline avatar Jul 03 '24 08:07 tomponline

@roosterfish shall we close this?

As it's clearly not causing any issues or errors on the LXD side, I will close this for now.

roosterfish avatar Jul 03 '24 08:07 roosterfish