SELinux behaves completely differently when I ```podman run``` vs. ```bootc switch``` my container image
The behavior of SELinux seems completely different when I run my bootable container under podman run vs deploying it (for example via bootc switch).
I think SELinux works differently for containers (where the whole container is treated as a single security context) than it does for hosts (where labeling is very fine grained). This means many instructions in my Containerfile, such as WORKDIR or RUN, have unexpected results.
The following Containerfile doesn't work (as an example):
FROM quay.io/centos-bootc/centos-bootc:stream9
# Substitute YOUR public key below; the holder of the matching private key will have root access
# podman build --build-arg="SSHPUBKEY=$(cat $HOME/.ssh/id_rsa.pub)" ...
ARG SSHPUBKEY
RUN mkdir /usr/etc-system && \
echo 'AuthorizedKeysFile /usr/etc-system/%u.keys' >> /etc/ssh/sshd_config.d/30-auth-system.conf && \
echo $SSHPUBKEY > /usr/etc-system/root.keys && chmod 0600 /usr/etc-system/root.keys
WORKDIR /locallm/models
RUN curl -LO https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
WORKDIR /locallm
RUN dnf -y install pip gcc-c++ python3-devel cmake
COPY requirements.txt /locallm/requirements.txt
RUN pip install --upgrade pip
RUN CMAKE_ARGS="-DLLAMA_NATIVE=off" FORCE_CMAKE=1 pip install --no-cache-dir --upgrade -r /locallm/requirements.txt
COPY run.sh run.sh
COPY run.service /etc/systemd/system/
RUN ln -s /etc/systemd/system/run.service /etc/systemd/system/multi-user.target.wants/run.service
# The following steps should be done in the bootc image.
CMD [ "/sbin/init" ]
STOPSIGNAL SIGRTMIN+3
RUN rpm --setcaps shadow-utils
With the following run.service file:
[Unit]
Description=Run LLama
[Service]
ExecStart=/locallm/run.sh /locallm/models/llama-2-7b-chat.Q5_K_S.gguf
[Install]
WantedBy=multi-user.target
And the following run.sh file:
#!/bin/bash
MODEL_PATH=${MODEL_PATH:=$1}
# A config file, when set, takes precedence over MODEL_PATH
if [ -n "${CONFIG_PATH}" ]; then
python -m llama_cpp.server --config_file "${CONFIG_PATH}"
exit 0
fi
if [ -n "${MODEL_PATH}" ]; then
python -m llama_cpp.server --model "${MODEL_PATH}" --host "${HOST:=0.0.0.0}" --port "${PORT:=8001}" --n_gpu_layers "${GPU_LAYERS:=0}" --clip_model_path "${CLIP_MODEL_PATH:=None}" --chat_format "${CHAT_FORMAT:=llama-2}"
exit 0
fi
echo "Please set either a CONFIG_PATH or a MODEL_PATH"
exit 1
This works when running as an application container, but fails when running as a bootable container, due to completely different SELinux models.
This is the behavior when running as a container with podman run:
● run.service - Run LLama
Loaded: loaded (/etc/systemd/system/run.service; enabled; preset: disabled)
Active: active (running) since Tue 2024-03-26 12:36:44 UTC; 35s ago
Main PID: 53 (python)
Tasks: 16 (limit: 1638)
Memory: 5.5G
CPU: 3.359s
CGroup: /system.slice/run.service
└─53 python -m llama_cpp.server --model /locallm/models/llama-2-7b-chat.Q5_K_S.gguf --host 0.0.0.0 --port 80>
This is the behavior when running via bootc switch:
× run.service - Run LLama
Loaded: loaded (/etc/systemd/system/run.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Tue 2024-03-26 12:58:46 UTC; 9s ago
Duration: 23ms
Process: 665 ExecStart=/locallm/run.sh /locallm/models/llama-2-7b-chat.Q5_K_S.gguf (code=exited, status=203/EXEC)
Main PID: 665 (code=exited, status=203/EXEC)
CPU: 1ms
Mar 26 12:58:46 localhost systemd[1]: Started Run LLama.
Mar 26 12:58:46 localhost systemd[665]: run.service: Failed to locate executable /locallm/run.sh: Permission denied
Mar 26 12:58:46 localhost systemd[665]: run.service: Failed at step EXEC spawning /locallm/run.sh: Permission denied
Mar 26 12:58:46 localhost systemd[1]: run.service: Main process exited, code=exited, status=203/EXEC
Mar 26 12:58:46 localhost systemd[1]: run.service: Failed with result 'exit-code'.
The following shows up in the journal as expected:
Mar 26 13:00:12 ip-172-31-52-150 systemd[919]: run.service: Failed to locate executable /locallm/run.sh: Permission denied
Mar 26 13:00:12 ip-172-31-52-150 systemd[919]: run.service: Failed at step EXEC spawning /locallm/run.sh: Permission denied
Mar 26 13:00:12 ip-172-31-52-150 kernel: audit: type=1400 audit(1711458012.861:6): avc: denied { execute } for pid=919 comm="(run.sh)" name="run.sh" dev="overlay" ino=634 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:default_t:s0 tclass=file permissive=0
Mar 26 13:00:12 ip-172-31-52-150 systemd[1]: run.service: Main process exited, code=exited, status=203/EXEC
Mar 26 13:00:12 ip-172-31-52-150 systemd[1]: run.service: Failed with result 'exit-code'.
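The denial above carries the two labels that matter here: the source domain (init_t) and the type on the target file (default_t). As a rough sketch, the fields can be pulled out of an AVC line with a few lines of bash. The helper names `avc_field` and `selinux_type` are hypothetical, and the code assumes the space-separated `key=value` layout seen in the journal line above.

```shell
#!/bin/bash
# avc_field KEY LINE: print the value of KEY=... from an AVC denial line.
# Hypothetical helper; assumes space-separated key=value pairs.
avc_field() {
    local key=$1 word
    local -a words
    read -ra words <<< "$2"
    for word in "${words[@]}"; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# selinux_type CONTEXT: print the type, i.e. the third field of user:role:type:level.
selinux_type() {
    local user role type rest
    IFS=: read -r user role type rest <<< "$1"
    printf '%s\n' "$type"
}

# The AVC message from the journal output above.
line='avc: denied { execute } for pid=919 comm="(run.sh)" name="run.sh" dev="overlay" ino=634 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:default_t:s0 tclass=file permissive=0'

src=$(selinux_type "$(avc_field scontext "$line")")
tgt=$(selinux_type "$(avc_field tcontext "$line")")
echo "domain $src was denied access to a file labeled $tgt"
```

Run against the line from this report, it shows that systemd (init_t) was refused execute on a file whose type is default_t.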
Yeah, it's because a new toplevel directory (here /locallm) ends up labeled default_t, which very few domains have access to; that is the init_t denial on run.sh in the AVC above. Ideally you could fix this in a derived container build, but that runs into https://github.com/ostreedev/ostree-rs-ext/issues/510, which will get fixed by https://github.com/containers/bootc/pull/215 longer term.
Now...yes. Arguably, we could change policy to e.g. make new unknown toplevels just be usr_t by default; cc @rhatdan ?
(But clearly it needs to be configurable in the container anyways)
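Until labeling is configurable in the image, one way to catch this before deploying is to flag paths that sit under a new toplevel directory. The following is only a sketch: the function name `is_new_toplevel` and the list of "known" toplevels are assumptions made for illustration and are not derived from the SELinux policy. On a system with the policy installed, `matchpathcon PATH` reports the label the policy would actually assign.

```shell
#!/bin/bash
# Illustrative list of conventional top-level directories.
# Assumption: hand-written for this sketch, not read from the loaded policy.
KNOWN_TOPLEVELS="bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var"

# is_new_toplevel PATH: succeed if PATH's first component is not in the list above.
is_new_toplevel() {
    local top=${1#/}      # drop the leading slash
    top=${top%%/*}        # keep only the first path component
    case " $KNOWN_TOPLEVELS " in
        *" $top "*) return 1 ;;
        *) return 0 ;;
    esac
}

for p in /locallm/run.sh /usr/bin/run.sh; do
    if is_new_toplevel "$p"; then
        echo "$p: new toplevel, may end up labeled default_t"
    else
        echo "$p: under a conventional toplevel"
    fi
done
```

With the paths from this report, /locallm/run.sh is flagged and /usr/bin/run.sh is not.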