storage icon indicating copy to clipboard operation
storage copied to clipboard

Confusing features, and locking semantics, of Mount/Unmount

Open mtrmac opened this issue 3 years ago • 5 comments

Currently, both store.Mount and store.Unmount only work with the primary layer store, and lock it for writing: https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/store.go#L2745 https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/store.go#L2850 as of e84b2649927d339254fe80bf87aee0ec32a6a9d7 .

Meanwhile, Diff reads all layer stores, and locks for reading: https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/store.go#L2940 as of 68d65106a07f84b44d2dce24d9d9bfde7313832c .

But the two are related: Diff can trigger mounts/unmounts via https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/layers.go#L1474 (although we don’t usually see that, because several graph drivers, notably overlay, implement DiffGetter


Note that Diff is, AFAICS, the only way to possibly trigger a mount/unmount from an additional layer store.

Meanwhile, Mount and Unmount nowadays reject attempts to modify them from read-only (= additional) stores: https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/layers.go#L950 https://github.com/containers/storage/blob/7de6744c9fdc616b44539703c5831ed0f328652c/layers.go#L1000 as of 0183a293dc01f22561b83448340c08e55f9b979b .

More curiously, Mount has an exception for read-only stores as of 0ef62eeea30955a62b543e11830257b3293c8ac5 , but

  • Unmount doesn’t have a corresponding exception
  • The only caller that could possibly benefit, Diff, does not set the “read-only” option, so it doesn’t fit that exception at all.

Questions:

  • Does mounting require the layer store to be locked read-only or read-write? Either Diff is incorrect, or Mount/Unmount is locking excessively. At least 0183a293dc01f22561b83448340c08e55f9b979b suggests that the intent was to make read-only layer store locks to be sufficient.
  • Can the “allow read-only mounts from read-only stores” exception be reached in the current codebase? And if not, is it desirable for it to be reachable?

@nalind @vrothberg @rhatdan ?

mtrmac avatar Oct 08 '22 03:10 mtrmac

Looking into this further, any mount/unmount call can modify layerStore-owned Layer.MountCount, which requires (per #1332 ) an in-process write lock against other readers. So at least the fully read-only access of Diff is insufficient (but it might be plausible to only hold an in-process exclusive lock, while keeping the filesystem locked for reading only).

mtrmac avatar Oct 14 '22 01:10 mtrmac

Note, to self and others: Carefully distinguish

  • read-only vs. read-write stores:lockfile.IsReadWrite is true for the primary layer store, false for additional stores. This is a constant (for the lifetime of the process, because the bit is stored in the per-process lock file… so at least for the lifetime of the part of the process that uses the same c/storage vendored codebase)
  • read-only vs. read-write cross-process access: Whether the lock file is locked for reading or writing (always read-only for additional stores). Or perhaps whether the mount lock file is locked for reading or writing.
  • read-only vs. read-write in-process access, governing data races when accessing memory.

It’s just way too much to keep track of, especially without explicit documentation and language help (#1389 ).

mtrmac avatar Nov 12 '22 21:11 mtrmac

any mount/unmount call can modify layerStore-owned Layer.MountCount, which requires (per #1332 ) an in-process write lock against other readers.

#1346 adds an in-process lock to synchronize these accesses (and eliminates some of the other accesses); but that doesn’t fix the Diff paths, which are insufficiently synchronized.

(Nothing at all changes about the unclear ability to mount layers from additional stores.)

mtrmac avatar Nov 14 '22 20:11 mtrmac

Podman mount and unmount are attempting to count how many times an image is mounted, thus the count. If I run

# mnt=$(podman mount alpine)
# podman run -d alpine sleep 200

This should end up with 2 mounts of the image. When the sleep completes it will go to 1. and if I do

# podman unmount alpine

Down to zero. And the overlay mount can actually be unmounted. This count can get out of whack if apps crash or the system crashes without decrementing the mount count to 0. (Unless the mount count is done on a tmpfs?)

I don't believe this is a solved problem, and I think we have some flakes in podman CI system, caused by miscounting the mounts.

rhatdan avatar Nov 15 '22 14:11 rhatdan

I’m not at all arguing against the existence of the mount count (and yes, it is stored in the “RunDir”).

This issue primarily about the ability / inability / correctness of mounting layers from additional stores, and the correctness of the concurrency implementation.

mtrmac avatar Nov 15 '22 14:11 mtrmac