ostree and toplevel directories

ostree sets the immutable bit on / (chattr +i /); the rationale is to keep all state in /etc and /var. However, some use cases need toplevel mount directories for compatibility.

Today, a workaround is to run chattr -i / from a systemd unit early in boot, for example:

[Unit]
DefaultDependencies=no
After=local-fs-pre.target
[Service]
ExecStart=chattr -i /
Type=oneshot
RemainAfterExit=yes

(Then be sure to order any .mount units after this unit)

This is a subset of https://github.com/projectatomic/rpm-ostree/issues/233

For this, all we need to do is take new empty toplevel directories from the RPM content so that one can use them as mount points.

Jun 21 '16 15:06 cgwalters

@cgwalters would this help the vagrant use case where I want to create an empty toplevel directory /vagrant/ and mount a network filesystem to that location?

Mar 21 '17 15:03 dustymabe

Yes.

May 18 '17 21:05 cgwalters

Another option is to support specifying them in the treefile, but that gets ugly.

May 18 '17 21:05 cgwalters

Not entirely the same thing as your original description and probably harder to pull off, but....

It would be nice to be able to dynamically create a top-level mount point that can be used on an atomic host. One example use case for this is vagrant-sshfs, where the user can specify where they want the mount point to be inside the vagrant box. I tend to use /sharedfolder/ a lot.

It would be acceptable if I first had to run a command to enable the empty top-level mount point before attempting to mount it.

May 18 '17 22:05 dustymabe

Today one can chattr -i /. Though a tricky thing I suspect is that since vagrant-sshfs runs before provisioning, that logic would have to be part of vagrant-sshfs itself.

That said...it's worth backing up and looking at the rationale for the ostree design here; basically, as a system administrator, you can know that all of the system state is underneath /etc and /var. You have just two places to back up. Backup systems don't have to e.g. traverse / but exclude /proc and /sys, etc.

On the flip side, the rationale is weaker if you have network mounts involved - you probably don't want to back up remote mounts.

One thing to note with the chattr approach is that the directory won't persist across upgrades. But on the other hand if you just mkdir it on boot and remount it, that's not a problem. Basically, toplevel network mounts are OK.

May 19 '17 13:05 cgwalters

Are there any unknown risks involved with using chattr -i / on atomic host? Would there be any problem with me adding that as a feature in vagrant-sshfs? Something like:

if want to mount a directory under `/`; then
    chattr -i /
fi

May 19 '17 21:05 dustymabe

You're asking about known unknowns I guess? :smile: Not aware of any offhand. The only thing I can think of is that it will allow other processes to create stuff there too, but...eh. As for implementation, I would do something like:

if want mount in /; then
    if test -f /run/ostree-booted; then
        chattr -i / || true
    fi
fi
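
A minimal runnable sketch of the same check, assuming the requested mount point is passed in as an argument. The helper names is_toplevel_path and unlock_root_if_needed are hypothetical and not part of vagrant-sshfs or ostree:

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Succeeds when the path is a direct child of /, e.g. /vagrant or /vagrant/,
# but not /, /var/mnt/data, or a relative path.
is_toplevel_path() {
    p=${1%/}                 # drop one trailing slash; "/" becomes ""
    case "$p" in
        /*/*) return 1 ;;    # nested below a toplevel directory
        /?*)  return 0 ;;    # exactly one component under /
        *)    return 1 ;;    # empty, "/", or relative
    esac
}

# Clears the immutable bit on / only when it is actually needed.
unlock_root_if_needed() {
    if is_toplevel_path "$1"; then
        # /run/ostree-booted is only present on ostree-booted systems.
        if test -f /run/ostree-booted; then
            chattr -i / || true
        fi
    fi
}
```

A caller would run something like unlock_root_if_needed /sharedfolder before creating the directory; on a non-ostree system the function does nothing.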

May 19 '17 21:05 cgwalters

> One thing to note with the chattr approach is that the directory won't persist across upgrades.

just realized this.. would it be worth it to have a config file where a user can configure empty top level mounts and a systemd unit that creates these mounts (defined in the config file) on boot?
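
A rough sketch of what such a boot-time helper might look like, to make the idea concrete. Everything here is an assumption: the function name, the config file format (one toplevel path per line, # for comments), and the choice to restore the immutable bit afterwards. Nothing like this ships in ostree or rpm-ostree:

```shell
#!/bin/sh
# Sketch only: create_toplevel_dirs and the config format are hypothetical.

# Usage: create_toplevel_dirs CONFIG_FILE ROOT
create_toplevel_dirs() {
    conf=$1
    root=$2
    [ -r "$conf" ] || return 0

    # Clear the immutable bit once; ignore failure (not root, no support).
    chattr -i "$root" 2>/dev/null || true

    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        name=${line#/}                    # strip one leading slash
        name=${name%/}                    # strip one trailing slash
        case "$name" in
            ''|*/*)
                echo "skipping non-toplevel entry: $line" >&2
                continue ;;
        esac
        mkdir -p "${root%/}/$name"
    done < "$conf"

    # Put the immutable bit back so nothing else gets created in the root.
    chattr +i "$root" 2>/dev/null || true
}
```

A oneshot systemd unit ordered before the relevant .mount units could call create_toplevel_dirs /etc/toplevel-mounts.conf / on every boot (the config path is made up for illustration), which would also cover the directories disappearing across upgrades.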

Jun 12 '17 02:06 dustymabe

Another use case is the Nix package manager, which requires a /nix mount point.

Feb 02 '19 20:02 redbaron

I also ran into the same issue and reported it at https://lists.fedoraproject.org/archives/list/[email protected]/thread/F2D2TMYJFRU3H24RJJNHCN3QGJPACXXU/

Once we introduce Fedora Container Linux alongside all the existing CoreOS systems that have their data mounted at some root folder, it would cause us lots of extra work to distinguish systems with the mount in a root folder from those with the mount at /var/lib/somefolder. (Background: we have lots of APIs that talk to all these systems via the Docker remote API and pass volume mounts to the container start calls, assuming the data folder is /dockerdata.)

All we need is the empty folder in root to mount disks at that place. Would it be possible to allow directories that carry some Ignition config attribute, such as immutable?

Like

storage:
  directories:
    - path: /dockerdata
      immutable: true

and then allow these folders to be created with chattr +i.

That would do the job for our use case.

Sep 09 '19 05:09 HeikoOnnebrink

> just realized this.. would it be worth it to have a config file where a user can configure empty top level mounts and a systemd unit that creates these mounts (defined in the config file) on boot?

We could do this with a fcct sugar perhaps; the main thing I think is requiring that these directories be mount points, because we don't want people to lose data.

Mar 23 '20 17:03 cgwalters

The workaround I've implemented is creating a mount-prepare.service unit that does what @dustymabe suggests:

[Unit]
Description=Prepare mount points
Before=remote-fs-pre.target
Wants=remote-fs-pre.target

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=/bin/sh -c "[ -d '{{ mount_point }}' ] || mkdir -p '{{ mount_point }}'"
ExecStopPost=chattr +i /

[Install]
WantedBy=remote-fs.target

I guess this, and the appropriate symlink (/etc/systemd/system/remote-fs.target.wants/mount-prepare.service -> /etc/systemd/system/mount-prepare.service) could be provisioned by ignition (I'm using ansible).

They should survive reboot & Zincati upgrade.

Apr 27 '20 15:04 paolope

However, I have a question (@lucab maybe can help?); suppose the mounts are in use, would Zincati/OSTree try to wipe their contents to bring the filesystem to the required state during an upgrade?

Apr 27 '20 15:04 paolope

> would Zincati/OSTree try to wipe their contents to bring the filesystem to the required state during an upgrade?

EDIT: Currently...no, ostree will only delete content from non-booted deployments, which won't have it mounted. But to be extra safe, you should instead do something like this:

ExecStart=/bin/sh -c "[ -L '/{{ mount_point }}' ] || ln -sr '/var/mnt/{{ mount_point }}' '/{{ mount_point }}'"

i.e. the only thing that exists in / is a symlink to /var/mnt (or something underneath /var anyways).

That way clients can also access the mount point via an OSTree-compatible path.

Or to rephrase, keep your state in /var and we're just doing this "symlink in /" for backcompat with older clients.
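
As a sketch of that symlink approach in script form, assuming a hypothetical helper called link_toplevel_into_var and an optional root prefix added purely so it can be tried outside of /. It uses a relative ln -s target rather than ln -sr, which should resolve to the same place:

```shell
#!/bin/sh
# Sketch only: link_toplevel_into_var is a hypothetical helper name.

# Usage: link_toplevel_into_var NAME [ROOT]
# Keeps the real directory at ROOT/var/mnt/NAME and exposes it as ROOT/NAME.
link_toplevel_into_var() {
    name=$1
    root=${2:-/}
    root=${root%/}            # "/" becomes "", so paths below start with one /

    mkdir -p "$root/var/mnt/$name"

    # Refuse to touch a real file or directory that is already there.
    if [ -e "$root/$name" ] && [ ! -L "$root/$name" ]; then
        echo "$root/$name exists and is not a symlink" >&2
        return 1
    fi

    if [ ! -L "$root/$name" ]; then
        # Creating an entry in / needs the immutable bit cleared; failures
        # (not root, unsupported filesystem) are ignored in this sketch.
        chattr -i "${root:-/}" 2>/dev/null || true
        # A relative target resolves against the directory holding the link.
        ln -s "var/mnt/$name" "$root/$name"
        chattr +i "${root:-/}" 2>/dev/null || true
    fi
}
```

For example, link_toplevel_into_var dockerdata would leave /dockerdata pointing at /var/mnt/dockerdata, so the actual mount can live under /var.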

Apr 27 '20 16:04 cgwalters

is there a reason this hasn't gotten merged?

Dec 23 '21 02:12 jorhett

@jorhett this is a bug report and not a PR, so it can't really be "merged". Are you maybe looking at some specific PR? Which top-level missing directory concerns you? What's your use case?

Dec 23 '21 09:12 lucab

Sorry for my choice of language. I was just trying to understand whether this need has been tackled yet, and if not, why not.

Dec 23 '21 23:12 jorhett

@paolope nice unit file. :) I modified it a bit to be more portable:

It can be installed either manually as /etc/systemd/system/mount-prepare@.service or via systemctl edit --force --full 'mount-prepare@.service'

[Unit]
Description=Prepare mount points
Before=remote-fs-pre.target
Wants=remote-fs-pre.target

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=/bin/sh -c "[ -d '%f' ] || mkdir -p '%f'"
ExecStopPost=chattr +i /

[Install]
WantedBy=remote-fs.target

Then you can enable one service for every directory that needs to be created. For example, systemctl enable mount-prepare@foo will create a /foo directory, and systemctl enable mount-prepare@foo-bar will create /foo/bar.

Ninja-Edit: actually, why is remote-fs used instead of local-fs? ostree-remount.service runs Before=local-fs.target, so it should be fine to use local-fs.target and just declare After=ostree-remount.service?

Ninja-Edit 2:

So I needed that for snap. My solution now looks like this:

[pi@rpi ~]$ systemctl cat mkdir-rootfs@
# /etc/systemd/system/mkdir-rootfs@.service
[Unit]
Description=Enable mount points in / for ostree
DefaultDependencies=no
ConditionPathExists=!%f

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=mkdir -p '%f'
ExecStopPost=chattr +i /

[pi@rpi ~]$ systemctl cat snap.mount
# /etc/systemd/system/snap.mount
[Unit]
After=mkdir-rootfs@snap.service
Wants=mkdir-rootfs@snap.service
Before=snapd.socket

[Mount]
What=/var/lib/snapd/snap
Where=/snap
Options=bind
Type=none

[Install]
WantedBy=snapd.socket

Works like a charm. :-)

Dec 24 '21 19:12 BreiteSeite

Strawman: Add new sysroot.toplevel-dirs knob of type string list. Teach libostree to read this knob and create the toplevel dirs whenever a new deployment is created.

With a read-only sysroot (which is the default in FCOS, and soon will be in FSB), this can only be really useful as a mountpoint, ensuring that users aren't able to store data directly into the deployment root (which would get lost). Edit: But actually, the deployment root is currently still mounted writable but I think we could make it read-only. But probably simpler for this to just also add the immutable bit on those dirs too.

Edit: first boot would have to be special-cased since the deployment would already exist; or maybe simplest we handle this in ostree-prepare-root like we do sysroot.read-only?

Jul 12 '22 01:07 jlebon

https://github.com/ostreedev/ostree/pull/2681 adds support for top-level symlinks. https://github.com/coreos/fedora-coreos-config/pull/1879 adds support for top-level symlinks on first boot configured via Ignition.

Jul 29 '22 22:07 jlebon

Please bear in mind that symlinks won't fix the problem.

One specific use case here is to install nix. It tries to achieve build purity everywhere. If /nix is a symlink, then it can yield impure results.

See the docs about it: https://nixos.org/manual/nix/stable/command-ref/env-common.html#env-NIX_IGNORE_SYMLINK_STORE

Jul 30 '22 06:07 yajo

Yup, the approach supports directories and it wouldn't be hard to add them. The reason I didn't for now is that it's harder to support in the downstreams I work on (see commit message). I think it's surmountable though. Regardless of what we do there, it'd make sense to support in libostree.

Jul 30 '22 13:07 jlebon

I think I mentioned this elsewhere too but: Another thing that might make sense is to switch to a tmpfs for / by default; we'd mount in /usr, /etc and /var, and create copies of the "base stuff" from the rootfs (e.g. /proc, the /lib -> /usr/lib symlink etc.). We'd probably still have an immutable bit on / by default but anyone who wanted to create empty toplevel directories and such could just chattr -i / and use systemd mount units to create directories there on boot.

We could also add an option to just turn off the immutable bit by default for admins who Know What They're Doing.

Oct 20 '23 14:10 cgwalters

Just for information, the solution in https://github.com/coreos/rpm-ostree/issues/337#issuecomment-1000923022 is racy (see https://github.com/containers/podman/pull/20612).

Until there is proper support in rpm-ostree/ostree, you have two options:

- Set up all directories in a single unit
- Create 3 units that are strictly ordered one after the other:
  - the first unit does the chattr -i
  - then the second unit is a template unit that does the mkdir for each mount point
  - then the last does the chattr +i
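
The three-unit option could be sketched roughly as below. This is an illustration under assumptions, not a tested recipe: the unit names (toplevel-unlock.service, toplevel-mkdir@.service, toplevel-lock.service) and the write_ordered_units helper are hypothetical, and the ordering is only meant for units started together at boot:

```shell
#!/bin/sh
# Sketch only: unit names are hypothetical, not shipped by ostree or rpm-ostree.

# Write three strictly ordered units into the directory given as $1.
write_ordered_units() {
    dir=$1
    mkdir -p "$dir"

    # 1. Clear the immutable bit once, before any directory is created.
    cat > "$dir/toplevel-unlock.service" <<'EOF'
[Unit]
Description=Clear the immutable bit on /
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=chattr -i /
EOF

    # 2. Template: one instance per mount point, ordered between the other two.
    cat > "$dir/toplevel-mkdir@.service" <<'EOF'
[Unit]
Description=Create toplevel mount point %f
DefaultDependencies=no
Requires=toplevel-unlock.service
After=toplevel-unlock.service
Wants=toplevel-lock.service
Before=toplevel-lock.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=mkdir -p %f
EOF

    # 3. Restore the immutable bit after every instance has finished.
    cat > "$dir/toplevel-lock.service" <<'EOF'
[Unit]
Description=Restore the immutable bit on /
DefaultDependencies=no
After=toplevel-unlock.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=chattr +i /
EOF
}
```

Running write_ordered_units /etc/systemd/system would drop the three files in place; each .mount unit would then still need Wants= and After= on its own toplevel-mkdir@ instance so that the directory exists before the mount is attempted.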

Nov 08 '23 17:11 travier

Note that https://github.com/ostreedev/ostree/pull/3114 effectively adds support for this (among other things)

Jan 10 '24 18:01 cgwalters

For custom edge image builds, I think it's just a matter of creating the directories for each target mount point specified as part of the ostree (image) build (instead of trying to create them at runtime).

Jan 10 '24 18:01 cgwalters
