sysbox
CIFS-backed volumes don't work inside a Sysbox container with Shiftfs
I have noticed suspicious behavior around CIFS shares. I used the Portainer UI to create the volume with the following options:
type: cifs
device: //127.0.0.1/dev/legacy
o: username=dev,password=dev,vers=3.1.1,uid=1000,gid=1000,cifsacl,mfsymlinks,cache=none
but the problem also reproduces with the default configuration.
If I mount such a share as read-only, it works fine when used by a user with ID 1000 inside a container run via sysbox-runc.
However, when I switch to read-write mode, I see the following behavior:
- Attempt to create a file is successful (for example, echo 33 > 42.txt)
- Attempt to copy a file from outside of the volume results in File exists
- Attempt to copy a file from within the volume results in File exists
- As a result of the operations above, the file is created anyway but has size = 0. When I repeat the copy operation, it succeeds (size matches, content too)
- The issue is not reproduced if I stop all containers that used the volume via sysbox-runc (to get shiftfs off it) and then attach it to a container run by the regular, default runc
This is likely related to shiftfs, but I have no idea how to debug the issue - the kernel module seems to have no options whatsoever.
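To make the symptom easier to check from a script, here is a minimal sketch that copies a file and compares sizes afterwards instead of trusting the exit status of cp. The function name copy_and_verify and the example paths in the comment are hypothetical, not part of any tool mentioned here.

```shell
#!/bin/sh
# Sketch only: copy SRC to DST, then compare byte counts.
# Returns 0 when the sizes match, 1 otherwise.
copy_and_verify() {
    src=$1
    dst=$2
    # cp may fail with "File exists" and still leave a 0-byte file behind,
    # so its exit status is reported but not used for the verdict.
    cp "$src" "$dst"
    cp_status=$?
    src_size=$(wc -c < "$src")
    if [ -f "$dst" ]; then
        dst_size=$(wc -c < "$dst")
    else
        dst_size=-1
    fi
    # $(( )) strips any padding that some wc implementations add.
    echo "cp exit=$cp_status src_bytes=$((src_size)) dst_bytes=$((dst_size))"
    [ "$((src_size))" -eq "$((dst_size))" ]
}

# Example (hypothetical paths), run inside the container:
#   copy_and_verify /tmp/test /home/dev/test5 || echo "copy is incomplete"
```

Running it twice against the same destination should show whether the second attempt succeeds the way the manual copy does.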
I am on:
cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
and I used the latest releases of Sysbox, Portainer and Docker.
Any ideas how to deal with it?
Additionally:
# testparm -s
Load smb config files from /etc/samba/smb.conf
WARNING: The "allocation roundup size" option is deprecated
WARNING: The "syslog" option is deprecated
Loaded services file OK.
WARNING: socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=131072 SO_SNDBUF=131072
This warning is printed because you set one of the
following options: SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT,
SO_RCVLOWAT
Modern server operating systems are tuned for
high network performance in the majority of situations;
when you set 'socket options' you are overriding those
settings.
Linux in particular has an auto-tuning mechanism for
buffer sizes (SO_SNDBUF, SO_RCVBUF) that will be
disabled if you specify a socket buffer size. This can
potentially cripple your TCP/IP stack.
Getting the 'socket options' correct can make a big
difference to your performance, but getting them wrong
can degrade it by just as much. As with any other low
level setting, if you must make changes to it, make
small changes and test the effect before making any
large changes.
Server role: ROLE_STANDALONE
# Global parameters
[global]
allow insecure wide links = Yes
dns proxy = No
log file = /var/log/samba/log.%m
map to guest = Bad User
max log size = 1000
min receivefile size = 16384
obey pam restrictions = Yes
pam password change = Yes
panic action = /usr/share/samba/panic-action %d
passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
passwd program = /usr/bin/passwd %u
server role = standalone server
server signing = No
server string = %h server (Samba, Ubuntu)
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=131072 SO_SNDBUF=131072
syslog = 0
unix password sync = Yes
usershare allow guests = Yes
idmap config * : backend = tdb
aio read size = 16384
aio write size = 16384
allocation roundup size = 4096
strict locking = No
use sendfile = Yes
[homes]
browseable = No
comment = Home Directories
force user = %S
inherit acls = Yes
inherit permissions = Yes
map acl inherit = Yes
map archive = No
read only = No
valid users = %S
vfs objects = acl_xattr
wide links = Yes
[printers]
browseable = No
comment = All Printers
create mask = 0700
path = /var/spool/samba
printable = Yes
[print$]
comment = Printer Drivers
path = /var/lib/samba/printers
Hi @AlexTalker , thanks for giving Sysbox a shot and for filing this issue.
I've not played around with CIFS volume mounts into Sysbox containers, but certainly shiftfs could be playing a role. In theory it should not since shiftfs is a thin overlay, but we have to investigate.
A couple of questions to help me debug:
- Can you provide the output of findmnt inside the container? This will help me see the CIFS mount and whether shiftfs is mounted on top of it or not.
- In order to repro on my side, is it as simple as creating a CIFS volume with Docker and mounting it into the container? I know you used the Portainer UI to create the volume; I am wondering if you have the command line instructions to do so.
Thanks!
I believe all Portainer does in the UI is provide self-explanatory fields to fill in for a simple CIFS/NFS mount (host, username, password, protocol). When I went a little more advanced (because I wanted to map UNIX permissions, since I share from UNIX to UNIX), I went straight to specifying driver options, the same way I suppose you would with Compose or the Docker CLI. That's why I listed the details at the beginning.
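For reference, those driver options should translate to a docker volume create call along these lines. This is a sketch rather than something taken from Portainer itself: the helper name cifs_volume_cmd is made up, the volume name UserLegacyTest is the one that appears in the findmnt output, and the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch only: print the docker CLI command for a CIFS-backed volume
# using the "local" volume driver with type/device/o options.
cifs_volume_cmd() {
    name=$1     # volume name
    device=$2   # //server/share path
    opts=$3     # comma-separated mount options (the "o" field)
    printf 'docker volume create --driver local --opt type=cifs --opt device=%s --opt o=%s %s\n' \
        "$device" "$opts" "$name"
}

# Values copied from the options listed at the top of this issue.
cifs_volume_cmd UserLegacyTest //127.0.0.1/dev/legacy \
    'username=dev,password=dev,vers=3.1.1,uid=1000,gid=1000,cifsacl,mfsymlinks,cache=none'
```

Piping the printed line to sh would create the volume, which can then be attached with -v UserLegacyTest:/home/dev on docker run.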
# cat /tmp/findmnt.txt
TARGET SOURCE FSTYPE OPTIONS
/ . shiftfs rw,relatime
├─/sys sysfs sysfs rw,nosuid,nodev,noexec,relatime
│ ├─/sys/firmware tmpfs tmpfs ro,relatime,uid=493216,gid=493216
│ ├─/sys/fs/cgroup tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,mode=755,uid=493216,gid=493216
│ │ ├─/sys/fs/cgroup/systemd systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,name=systemd
│ │ ├─/sys/fs/cgroup/pids cgroup cgroup rw,nosuid,nodev,noexec,relatime,pids
│ │ ├─/sys/fs/cgroup/rdma cgroup cgroup rw,nosuid,nodev,noexec,relatime,rdma
│ │ ├─/sys/fs/cgroup/cpu,cpuacct cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
│ │ ├─/sys/fs/cgroup/cpuset cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpuset
│ │ ├─/sys/fs/cgroup/net_cls,net_prio cgroup cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
│ │ ├─/sys/fs/cgroup/hugetlb cgroup cgroup rw,nosuid,nodev,noexec,relatime,hugetlb
│ │ ├─/sys/fs/cgroup/perf_event cgroup cgroup rw,nosuid,nodev,noexec,relatime,perf_event
│ │ ├─/sys/fs/cgroup/memory cgroup cgroup rw,nosuid,nodev,noexec,relatime,memory
│ │ ├─/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,freezer
│ │ ├─/sys/fs/cgroup/devices cgroup cgroup rw,nosuid,nodev,noexec,relatime,devices
│ │ └─/sys/fs/cgroup/blkio cgroup cgroup rw,nosuid,nodev,noexec,relatime,blkio
│ ├─/sys/kernel/config tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,size=1024k,uid=493216,gid=493216
│ ├─/sys/kernel/debug tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,size=1024k,uid=493216,gid=493216
│ ├─/sys/kernel/tracing tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,size=1024k,uid=493216,gid=493216
│ └─/sys/module/nf_conntrack/parameters/hashsize sysboxfs[/sys/module/nf_conntrack/parameters/hashsize] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
├─/proc proc proc rw,nosuid,nodev,noexec,relatime
│ ├─/proc/bus proc[/bus] proc ro,relatime
│ ├─/proc/fs proc[/fs] proc ro,relatime
│ ├─/proc/irq proc[/irq] proc ro,relatime
│ ├─/proc/sysrq-trigger proc[/sysrq-trigger] proc ro,relatime
│ ├─/proc/asound tmpfs tmpfs ro,relatime,uid=493216,gid=493216
│ ├─/proc/acpi tmpfs tmpfs ro,relatime,uid=493216,gid=493216
│ ├─/proc/keys udev[/null] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/proc/timer_list udev[/null] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/proc/sched_debug udev[/null] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/proc/scsi tmpfs tmpfs ro,relatime,uid=493216,gid=493216
│ ├─/proc/swaps sysboxfs[/proc/swaps] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
│ ├─/proc/sys sysboxfs[/proc/sys] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
│ └─/proc/uptime sysboxfs[/proc/uptime] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
├─/dev tmpfs tmpfs rw,nosuid,size=65536k,mode=755,uid=493216,gid=493216
│ ├─/dev/console devpts[/0] devpts rw,nosuid,noexec,relatime,gid=493221,mode=620,ptmxmode=666
│ ├─/dev/mqueue mqueue mqueue rw,nosuid,nodev,noexec,relatime
│ ├─/dev/pts devpts devpts rw,nosuid,noexec,relatime,gid=493221,mode=620,ptmxmode=666
│ ├─/dev/shm shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k,uid=493216,gid=493216
│ ├─/dev/kmsg udev[/null] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/dev/null udev[/null] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/dev/random udev[/random] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/dev/full udev[/full] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/dev/tty udev[/tty] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ ├─/dev/zero udev[/zero] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
│ └─/dev/urandom udev[/urandom] devtmpfs rw,nosuid,noexec,relatime,size=8118552k,nr_inodes=2029638,mode=755
├─/home/dev /var/lib/docker/volumes/UserLegacyTest/_data shiftfs rw,relatime
├─/etc/resolv.conf /var/lib/docker/containers/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5[/resolv.conf] shiftfs rw,relatime
├─/etc/hostname /var/lib/docker/containers/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5[/hostname] shiftfs rw,relatime
├─/etc/hosts /var/lib/docker/containers/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5[/hosts] shiftfs rw,relatime
├─/var/lib/docker /dev/sda5[/var/lib/sysbox/docker/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5] ext4 rw,relatime,errors=remount-ro,stripe=32730
├─/var/lib/kubelet /dev/sda5[/var/lib/sysbox/kubelet/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5] ext4 rw,relatime,errors=remount-ro,stripe=32730
└─/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs /dev/sda5[/var/lib/sysbox/containerd/435e77513d711c315854bd100c1eb9e0cf6dde6bc89f0a8c617408918e53a4f5] ext4 rw,relatime,errors=remount-ro,stripe=32730
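The entry to look at in the listing above is /home/dev: its FSTYPE is shiftfs, not cifs. As a minimal sketch (the function name fstype_of is hypothetical, and it assumes each mount sits on a single unwrapped line as in this dump), the filesystem type for one target can be pulled out of a saved findmnt dump like so:

```shell
#!/bin/sh
# Sketch only: read findmnt's tree output on stdin and print the FSTYPE
# column for the given mount target.
fstype_of() {
    # Strip the tree-drawing prefix (everything before the first "/"),
    # then match the TARGET column and print the third column (FSTYPE).
    sed 's/^[^/]*//' | awk -v target="$1" '$1 == target { print $3 }'
}

# Example, using the dump saved earlier:
#   fstype_of /home/dev < /tmp/findmnt.txt
# On a live system, findmnt -n -o FSTYPE --target /home/dev
# should give the same answer without any parsing.
```

A result of shiftfs for the volume path means the CIFS mount is being accessed through the shiftfs layer.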
For the last test I used the official Arch image.
Also, as you might have noticed, I share a folder from the home directory of a user (dev) who has UID=1006 on the host system, but since the container user has UID=1000, I specify that instead of the default root/root behavior.
As I stated before, read-only mode works quite okay for my purposes, but read-write seems to mess things up somewhere.
Just now, I initially tried to write the output of findmnt to the volume too, and it didn't give any errors (the shell redirect >), but the file size was 0 anyway, even on the original FS (ext4). Strange.
-rwxrwxr-x+ 1 dev dev 1.0M Jan 13 18:36 test
-rwxrwxr-x+ 1 dev dev 0 Jan 13 18:57 test2
-rwxrwxr-x+ 1 dev dev 0 Jan 13 19:01 test3
-rwxrwxr-x+ 1 dev dev 0 Jan 13 19:43 test4
This is the view from ext4: all the "copied" files have size 0. Docker tricked me into thinking it succeeded after file creation, but the reality is even more disappointing :(
After literally just switching the runtime for the container (which I think means that Portainer re-creates the container), it works as expected:
-rwxrwxr-x+ 1 dev dev 1.0M Jan 13 18:36 test
-rwxrwxr-x+ 1 dev dev 0 Jan 13 18:57 test2
-rwxrwxr-x+ 1 dev dev 0 Jan 13 19:01 test3
-rwxrwxr-x+ 1 dev dev 0 Jan 13 19:43 test4
-rwxrwxr-x+ 1 dev dev 1.0M Jan 13 19:47 test5
# findmnt
TARGET SOURCE FSTYPE OPTIONS
/ overlay overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/OGJIWCE7ASZ7ISKXGCQD7EZ4HD:/var/lib/docker/overlay2/l/QJCG
├─/proc proc proc rw,nosuid,nodev,noexec,relatime
│ ├─/proc/bus proc[/bus] proc ro,relatime
│ ├─/proc/fs proc[/fs] proc ro,relatime
│ ├─/proc/irq proc[/irq] proc ro,relatime
│ ├─/proc/sys proc[/sys] proc ro,relatime
│ ├─/proc/sysrq-trigger proc[/sysrq-trigger] proc ro,relatime
│ ├─/proc/asound tmpfs tmpfs ro,relatime
│ ├─/proc/acpi tmpfs tmpfs ro,relatime
│ ├─/proc/kcore tmpfs[/null] tmpfs rw,nosuid,size=65536k,mode=755
│ ├─/proc/keys tmpfs[/null] tmpfs rw,nosuid,size=65536k,mode=755
│ ├─/proc/timer_list tmpfs[/null] tmpfs rw,nosuid,size=65536k,mode=755
│ ├─/proc/sched_debug tmpfs[/null] tmpfs rw,nosuid,size=65536k,mode=755
│ └─/proc/scsi tmpfs tmpfs ro,relatime
├─/dev tmpfs tmpfs rw,nosuid,size=65536k,mode=755
│ ├─/dev/console devpts[/0] devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666
│ ├─/dev/pts devpts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666
│ ├─/dev/mqueue mqueue mqueue rw,nosuid,nodev,noexec,relatime
│ └─/dev/shm shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k
├─/sys sysfs sysfs ro,nosuid,nodev,noexec,relatime
│ ├─/sys/firmware tmpfs tmpfs ro,relatime
│ └─/sys/fs/cgroup tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,mode=755
│ ├─/sys/fs/cgroup/systemd cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,xattr,name=systemd
│ ├─/sys/fs/cgroup/pids cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,pids
│ ├─/sys/fs/cgroup/rdma cgroup cgroup ro,nosuid,nodev,noexec,relatime,rdma
│ ├─/sys/fs/cgroup/cpu,cpuacct cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,cpu,cpuacct
│ ├─/sys/fs/cgroup/cpuset cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,cpuset
│ ├─/sys/fs/cgroup/net_cls,net_prio
│ │ cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,net_cls,net_prio
│ ├─/sys/fs/cgroup/hugetlb cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,hugetlb
│ ├─/sys/fs/cgroup/perf_event cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,perf_event
│ ├─/sys/fs/cgroup/memory cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,memory
│ ├─/sys/fs/cgroup/freezer cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,freezer
│ ├─/sys/fs/cgroup/devices cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ │ cgroup ro,nosuid,nodev,noexec,relatime,devices
│ └─/sys/fs/cgroup/blkio cgroup[/docker/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928]
│ cgroup ro,nosuid,nodev,noexec,relatime,blkio
├─/home/dev //127.0.0.1/dev/legacy[/legacy] cifs rw,relatime,vers=3.0,cache=strict,username=dev,uid=0,noforceuid,gid=0,noforcegid,addr=127.0.0.1,file_mode=
├─/etc/resolv.conf /dev/sda5[/var/lib/docker/containers/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928/resolv.conf]
│ ext4 rw,relatime,errors=remount-ro,stripe=32730
├─/etc/hostname /dev/sda5[/var/lib/docker/containers/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928/hostname]
│ ext4 rw,relatime,errors=remount-ro,stripe=32730
└─/etc/hosts /dev/sda5[/var/lib/docker/containers/85b78a1ac648435a8b50425f7fd189c3f00423b4c55e921d752ccc4b23128928/hosts]
ext4 rw,relatime,errors=remount-ro,stripe=32730
Hi @AlexTalker, thanks again for all the info provided.
I was able to reproduce the problem, and it certainly appears to be caused by the interaction between shiftfs and cifs (shiftfs is acting as a thin overlay on top of cifs).
Unfortunately the kernel log (dmesg) did not provide much info, except for the following:
[1994479.821465] CIFS VFS: cifs_invalidate_mapping: could not invalidate inode 000000004fa38d62
A similar problem was spotted last year on LXD (which also uses shiftfs): https://github.com/lxc/lxd/issues/6590.
Solving this will require going down into kernel space to figure out what's causing the bad interaction between these filesystems. Unfortunately I don't have the cycles to do this right now (due to other priorities).
In the meantime, a work-around in order to mount a cifs-backed volume into a Sysbox container would be to configure Docker in userns-remap mode. This way Sysbox won't need to use shiftfs anymore.
If you want to do this, add the "userns-remap" line to the /etc/docker/daemon.json file:
cat /etc/docker/daemon.json
{
"userns-remap": "sysbox",
"runtimes": {
"sysbox-runc": {
"path": "/usr/local/sbin/sysbox-runc"
}
},
"default-address-pools": [
{
"base": "172.80.0.0/16",
"size": 24
}
],
"bip": "172.20.0.1/16"
}
Then restart Docker with systemctl restart docker, and create the container with Docker + Sysbox as usual.
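Before restarting, it can help to confirm the setting actually landed in the file. A minimal sketch, assuming the key and its value sit on one line as in the example above; the function name userns_remap_value is made up here, and this is a plain text match rather than a real JSON parser:

```shell
#!/bin/sh
# Sketch only: print the value of "userns-remap" from a daemon.json-style
# file, or print nothing if the key is not present.
userns_remap_value() {
    sed -n 's/.*"userns-remap"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Example:
#   userns_remap_value /etc/docker/daemon.json
```

With the configuration shown above, the expected value is sysbox.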
One caveat: if you decide to use userns-remap, then the CIFS mount must be configured with a uid:gid that matches the subuid:subgid associated with Sysbox. Otherwise the files will show up as nobody:nogroup inside the container.
For example, on my machine, Sysbox is associated with subuid 165536:
$ cat /etc/subuid | grep sysbox
sysbox:165536:65536
Thus, I had to mount the cifs share as follows:
sudo mount -t cifs -o username="cesar",uid=165536,gid=165536 //10.0.0.48/sambashare /mnt/winshare
Then I launched the container with:
docker run --runtime=sysbox-runc --rm -it -v /mnt/winshare:/mnt/winshare nestybox/ubuntu-focal-systemd-docker
And inside the container the cifs share is mounted properly:
# findmnt | grep cifs
|-/mnt/winshare //10.0.0.48/sambashare cifs rw,relatime,vers=3.1.1,cache=strict,username=cesar,uid=165536,forceuid,gid=165536,forcegid,addr=10.0.0.48,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1
and the files have the appropriate ownership:
root@200e385f451d:~# ls -l /mnt/winshare/
total 2048
-rwxr-xr-x 1 root root 9 Jan 13 22:56 test
-rwxr-xr-x 1 root root 6 Jan 13 22:56 test2
After this everything works normally.
By the way, there is work happening at kernel level that will remove the need for shiftfs in the near future. This will likely fix this issue and make the work-around I described unnecessary.
@ctalledo Thanks for looking into it for me. You see, that's exactly what bothers me - I do not want to have the volume mapped under root rights in Docker, since I don't want to scare off some utilities I may use there.
Is it possible to somehow add an additional mapping so that files can be mounted under the UID of the container user?
@ctalledo I spotted this record in /etc/subuid:
dev:558752:65536
Will it suffice for my needs if I supply 558... in the mount parameters? Or will it still be nobody?
Hi @AlexTalker,
Since Sysbox uses the Linux user-namespace for its containers, there is mapping of user-IDs going on.
Assuming that at host level:
- You've configured Docker with userns-remap: "sysbox"
- And files "/etc/subuid" and "/etc/subgid" have an entry such as sysbox:165536:65536
Then inside the container:
- User 0 (Root) = host user 165536
- User 1000 = host user 165536 + 1000
Thus, say you want the cifs volume to appear inside the container as owned by user 1000. Then at host level you would create the cifs mount with uid:gid 165536+1000 = 166536. E.g.,
sudo mount -t cifs -o username="cesar",uid=166536,gid=166536 //10.0.0.48/sambashare /mnt/winshare
and then create the container as usual:
docker run --runtime=sysbox-runc --rm -it -v /mnt/winshare:/mnt/winshare nestybox/ubuntu-focal-systemd-docker
Does that answer your question?
Note that the /etc/subuid file inside the container has no bearing on this. It's the /etc/subuid file at host level you care about.
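The mapping above is just an offset into the host's subordinate ID range. A minimal sketch of that arithmetic, assuming the usual name:start:count format of /etc/subuid; the function name host_id_for is hypothetical and not part of Sysbox or Docker:

```shell
#!/bin/sh
# Sketch only: translate a container UID/GID into the host ID it maps to.
# Exit status: 0 on success, 1 if the owner has no entry,
# 2 if the container ID falls outside the owner's range.
host_id_for() {
    subid_file=$1     # e.g. /etc/subuid or /etc/subgid on the host
    owner=$2          # e.g. sysbox
    container_id=$3   # e.g. 1000
    awk -F: -v owner="$owner" -v cid="$container_id" '
        $1 == owner {
            found = 1
            # $2 is the first host ID of the range, $3 is its length.
            if (cid + 0 >= $3 + 0) { status = 2 } else { print $2 + cid }
            exit
        }
        END { if (!found) status = 1; exit status + 0 }
    ' "$subid_file"
}

# Example:
#   uid=$(host_id_for /etc/subuid sysbox 1000)
#   gid=$(host_id_for /etc/subgid sysbox 1000)
```

With the sysbox:165536:65536 entry from above, container UID 1000 comes out as 166536, which is the value passed as uid= and gid= to mount -t cifs.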
@ctalledo If I understand correctly, /etc/subuid acts as a "slice" of IDs then, if you say that such math works?
If so, how does one limit how many IDs are available in the container?
Gonna try your trick tomorrow, see what I get.
Also, in this case, can regular containers still function "normally"? I mean, does "userns-remap" only matter for sysbox-runc?!
@ctalledo If I understand correctly, /etc/subuid acts as a "slice" of IDs then, if you say that such math works?
Correct.
If so, how does one limit how many IDs are available in the container?
Sysbox assigns a range of 65536 UIDs to the container. It takes these from the slice associated with user "sysbox" in /etc/subuid.
Also, in this case, can regular containers still function "normally"? I mean, does "userns-remap" only matter for sysbox-runc?!
Docker userns-remap applies to all Docker containers, even those deployed with the default OCI runc runtime. This improves container isolation (root in the container is not root on the host), but does have some limitations (see [here](https://docs.docker.com/engine/security/userns-remap/)).
In general we prefer that Docker remain in regular mode, but this requires Sysbox to use shiftfs, which is mostly fine, though you found that shiftfs-on-cifs is not working properly (unfortunately).
@ctalledo Thanks for the explanation. Could you please also highlight whether or not I need to change ownership in /var/lib/docker to make it work and/or restart the respective sysbox services?! Or is a Docker restart & image re-setup enough?
Hi @AlexTalker,
could you please also highlight whether or not I need to change ownership in /var/lib/docker to make it work
No, this should not be needed; Docker takes care of it (Docker is the sole manager of that directory, so if we are changing ownership there, something is off).
and/or restart respective sysbox services?! Or docker restart & image re-setup is enough?
No need to restart Sysbox when configuring Docker in userns-remap. The Docker restart is enough. Just make sure all containers are stopped/removed before switching Docker to userns-remap mode.