ostree-rs-ext
ostree-rs-ext copied to clipboard
OCI SELinux labeling mismatch when package only ships binary policy - greetd is broken
Hi OSTree team,
I am trying to debug an issue that surfaced in Ublue (https://github.com/ublue-os/main/issues/223). In Ublue we are making use of the OCI functionality heavily to produce custom downstream images of Fedora Silverblue. Some users have wanted to install greetd
as their greeter for simplicity but whenever they do the resulting image is slightly broken because of wrong SELinux labeling.
I have investigated the package and I see that /usr/bin/greetd
should be labeled as xdm_exec_t
(see: here) which makes sense for a display manager but in the resulting images we see that /usr/bin/greetd
is labeled as bin_t
and therefore access gets denied by SELinux when the system is booted.
This is the output of ls -lZ
in the images:
-rwxr-xr-x. 3 root root system_u:object_r:bin_t:s0 839488 Jan 1 1970 /usr/bin/greetd
this is the package source that shows the intended labeling:
I tried to trace down the issue and I found this function: sepolicy_from_base
:
1 fn sepolicy_from_base(repo: &ostree::Repo, base: &str) -> Result<tempfile::TempDir> {
2 let cancellable = gio::Cancellable::NONE;
3 let policypath = "usr/etc/selinux";
4 let tempdir = tempfile::tempdir()?;
5 let (root, _) = repo.read_commit(base, cancellable)?;
6 let policyroot = root.resolve_relative_path(policypath);
7 if policyroot.query_exists(cancellable) {
8 let policydest = tempdir.path().join(policypath);
9 std::fs::create_dir_all(policydest.parent().unwrap())?;
10 let opts = ostree::RepoCheckoutAtOptions {
11 mode: ostree::RepoCheckoutMode::User,
12 subpath: Some(Path::new(policypath).to_owned()),
13 ..Default::default()
14 };
15 repo.checkout_at(Some(&opts), ostree::AT_FDCWD, policydest, base, cancellable)?;
16 }
17 Ok(tempdir)
18 }
I am not an expert but if I understand correctly this will look for the labels defined in /usr/etc/selinux
in the resulting image and then it will restorecon
based on that. In this case the package only ships a binary policy so my theory was that this would not find the rule /usr/bin/greetd -- gen_context(system_u:object_r:xdm_exec_t,s0)
which is defined in a binary policy therefore leading to the wrong labeling we observed.
We can use this minimal Containerfile to reproduce reliably:
FROM quay.io/fedora/fedora-silverblue:39
RUN rpm-ostree install -y greetd \
&& ls -lZ /usr/bin/greetd
Yes. This relates to https://github.com/ostreedev/ostree-rs-ext/issues/388 where we want to basically move the labeling to the build time.
That said, something I think would also be interesting is to create a dedicated "image transform" container that can accept any arbitrary container as input and produce a "boot-ready" (including proper selinux labeling) container image. This would replace a need for ostree container commit
, and we could also do things like flatten layers in a configurable way and also (crucially) canonicalize timestamps etc.
To fix this in the current architecture (without changing ostree container commit
) is messy because we'd have to:
- fetch all the layers
- compute the final merged filesystem tree
- check out the final merged
/etc/selinux
at least - Use that policy to re-compute the final filesystem tree
Basically SELinux greatly complicates the architecture here because it's a set of files that can affect any other file. So far the "always use the policy from the base" is a compromise.
I guess the first step towards the "container transformer" is to have a container that is capable of doing the labelling needed.
Mm I've been thinking about this and trying to get a container that allows me to properly label these files - even if it's just a PoC
I've been using podman and I ran into the issue that podman passes a context=
option to the underlying overlayfs. This means inside the podman container no labelling actually works even if the container is privileged with label=nested
.
I'm just leaving this comment here for my own reference but if anyone has any clues that'd be helpful.
There's a lot of complexity here around xattrs and OCI and podman, see extensive discussion in https://github.com/containers/storage/pull/1608
An update here, we were able to troubleshoot around some non-filesystem related SELinux issues by shipping our custom policy and running semodule
Update: It works! You just have to run
semodule -n -s targeted -X 200 -i /usr/share/selinux/packages/targeted/yourpolicyorwhereverelseitslocated.pp
in the image in the CI as well. This is what the standard .rpm function%selinux_modules_install
does as far as I can tell.
This specific policy does not have any changes to the filesystem so I am not sure if it would be affected by this but noting here just for reference.
In theory if can just load the policy from within the container this may work? (I didn’t know GitHub let you do that until today)
(sorry for the noise)
mm it seems that if we use podman btrfs
storage driver, the overlayfs stuff gets entirely side-stepped and SELinux works in a rootless container even (we still need to pass /etc/selinux through as libselinux checks that)
podman run --rm -it --security-opt label=nested --privileged -v /etc/selinux:/etc/selinux greetd:bugx /usr/sbin/restorecon -RFv /usr/bin/greetd
Relabeled /usr/bin/greetd from system_u:object_r:container_file_t:s0:c1022,c1023 to system_u:object_r:xdm_exec_t:s0
so in theory if we can just grab a container, run it using the btrfs
storage driver and just run restorecon -Rv /
and save the resulting image then this should work?
so in theory if we can just grab a container, run it using the btrfs storage driver and just run restorecon -Rv / and save the resulting image then this should work?
Not really, ostree wants to own the security contexts, trying to change them behind its back won't really work.
The most reliable is going to be https://github.com/ostreedev/ostree-rs-ext/issues/510#issuecomment-1660662722 but it's ugly to implement today.
Yeah, this also breaks layering libvirt and enabling swtpm. Annoying.
Does OSTREE just need the final /etc/selinux/targeted/context/files directory? Or do the files actually have to be applied as XAttrs within the image?
For swtpm, what would be the most minimal work around right now?
This breaks VMs quite a bit for me.
For my minimal workaround. I am using a systemd service to copy the swtpm binary to a mutable location. Bind mounting the mutable binary back over /usr/bin/swtpm. Then fix the file contexts and relabel it. What are the downsides for this work around?
# /etc/systemd/system/swtpm-workaround.service
[Unit]
Description=Workaround swtpm not having the correct label
ConditionFileIsExecutable=/usr/bin/swtpm
After=local-fs.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/swtpm-workaround
[Install]
WantedBy=multi-user.target
# /usr/local/bin/swtpm-workaround
#!/usr/bin/env bash
cp /usr/bin/swtpm /usr/local/bin/swtpm
mount --bind /usr/local/bin/swtpm /usr/bin/swtpm
semanage fcontext -a -t swtpm_exec_t "/usr/bin/swtpm"
restorecon /usr/bin/swtpm
ls -lZ /usr/bin/swtpm
-rwxr-xr-x. 1 root root system_u:object_r:swtpm_exec_t:s0 42136 Dec 14 10:21 /usr/bin/swtpm
This also relates to https://github.com/containers/bootc/pull/215 - that's where I'm thinking about going with this.
Another workaround is basically:
bootc usroverlay
restorecon /usr/bin/swtpm