watcher: On Linux with the Python lib, no events are generated after unmounting the filesystem that contains the monitored path

umount test

mount -t tmpfs tmpfs test

touch hi

-> no events generated

Maybe this is inotify lib behavior? Maybe the watcher monitors the inode instead of the path.

I know inotify will capture fs umount events and generate a corresponding event. Maybe it is a good idea to pass that event along as well?

This also happens when a filesystem is mounted on a path or subfolder the program is watching.

mkdir subdir

touch subdir/hi

-> events generated

mount -t tmpfs tmpfs subdir

touch subdir/hi2

-> no events generated

Jan 29 '25 08:01 yufei-pan

thinking and looking. might not be able to get around to this one today.

Feb 01 '25 22:02 e-dant

One thing we could do is try to create new inotify instances per-filesystem. (As a workaround, this would be somewhat equivalent to launching new instances of watcher on a per-filesystem basis.)
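To make the per-filesystem idea concrete, here is a minimal Python sketch that buckets the directories under a root by device ID, so each bucket could get its own inotify instance (or its own watcher). It assumes `st_dev` is a good enough proxy for "filesystem", and `group_dirs_by_device` is a hypothetical helper, not part of watcher.

```python
import os
from collections import defaultdict


def group_dirs_by_device(root):
    """Bucket every directory under `root` by the device ID it lives on.

    Hypothetical helper for illustration; not part of the watcher API.
    """
    groups = defaultdict(list)
    # os.walk does not follow symlinks by default, but it does descend
    # into mount points, which is exactly what we want to notice here.
    for dirpath, _dirnames, _filenames in os.walk(root):
        try:
            dev = os.stat(dirpath).st_dev
        except OSError:
            continue  # the directory vanished or is unreadable; skip it
        groups[dev].append(dirpath)
    return dict(groups)
```

A tree with no nested mounts yields a single bucket; a tmpfs mounted on a subdirectory would be expected to show up as a second key.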

I don't see how to get more information out of inotify any other way.

I know inotify will capture fs umount events and generate a corresponding event. Maybe it is a good idea to pass that event along as well?

I don't know how.

Testing with:

[ -d /tmp/wtr/mnt ] || mkdir -p /tmp/wtr/mnt ; cd /tmp/wtr ; sudo mount -t tmpfs tmpfs mnt ; touch mnt/a ; ll mnt ; sudo umount mnt

I see nothing extra from inotify when running with the following patch. No new events. We don't appear to be missing anything on the release branch.

diff --git a/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp b/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
index ac861b9..d1d3ac2 100644
--- a/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
+++ b/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
@@ -78,13 +78,7 @@ struct ke_in_ev {
         IN_OPEN
   */
   static constexpr unsigned recv_mask
-    = IN_CREATE
-    | IN_DELETE
-    | IN_DELETE_SELF
-    | IN_MODIFY
-    | IN_MOVE_SELF
-    | IN_MOVED_FROM
-    | IN_MOVED_TO;
+    = IN_ALL_EVENTS;

   int fd = -1;
   using paths = std::unordered_map<int, std::filesystem::path>;
@@ -328,6 +322,38 @@ inline auto do_ev_recv = [](auto const& cb, sysres& sr) -> result
     return has_any && ! is_self_info;
   };

+  auto dbg_inotify_event = [](inotify_event const* const in)
+  {
+    if (! in) {
+      fprintf(stderr, "inotify_event: null\n");
+      return;
+    }
+    auto msk = in->mask;
+    auto isdir = msk & IN_ISDIR;
+    auto iscreate = msk & IN_CREATE;
+    auto isdelete = msk & IN_DELETE;
+    auto ismodify = msk & IN_MODIFY;
+    auto ismove = msk & IN_MOVE;
+    auto isself = msk & IN_MOVE_SELF;
+    auto isfrom = msk & IN_MOVED_FROM;
+    auto isto = msk & IN_MOVED_TO;
+    fprintf(stderr, "inotify_event (msk=%s%s%s%s%s%s%s%s) {\n",
+      isdir ? "d" : ".",
+      iscreate ? "c" : ".",
+      isdelete ? "d" : ".",
+      ismodify ? "m" : ".",
+      ismove ? "m" : ".",
+      isself ? "s" : ".",
+      isfrom ? "f" : ".",
+      isto ? "t" : ".");
+    fprintf(stderr, "  int wd: %d\n", in->wd);
+    fprintf(stderr, "  uint32_t mask: %u\n", in->mask);
+    fprintf(stderr, "  uint32_t cookie: %u\n", in->cookie);
+    fprintf(stderr, "  uint32_t len: %u\n", in->len);
+    fprintf(stderr, "  char name: %s\n", in->len ? in->name : "");
+    fprintf(stderr, "}\n");
+  };
+
   auto read_len = read(sr.ke.fd, sr.ke.ev_buf, sizeof(sr.ke.ev_buf));
   if (read_len < 0 && errno != EAGAIN)
     return result::e_sys_api_read;
@@ -337,6 +363,7 @@ inline auto do_ev_recv = [](auto const& cb, sysres& sr) -> result
     unsigned in_ev_c = 0;
     auto dmrm = defer_dm_rm_wd{sr.ke};
     while (in_ev && in_ev < in_ev_tail) {
+      dbg_inotify_event(in_ev);
       auto in_ev_next = peek(in_ev, in_ev_tail);
       unsigned msk = in_ev->mask;
       if (in_ev_c++ > ke_in_ev::c_ulim)
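The debug helper in the patch prints letters for only a subset of bits, so an unmount would be visible only in the raw `mask` number. Per inotify(7), `IN_UNMOUNT`, `IN_IGNORED`, `IN_Q_OVERFLOW` and `IN_ISDIR` can be set in a returned mask whether or not they were requested, and they are not part of `IN_ALL_EVENTS`. A small Python sketch for decoding the raw value printed above; the bit values are taken from `<linux/inotify.h>` and `decode_inotify_mask` is a hypothetical name:

```python
# Bit values from <linux/inotify.h>.
IN_FLAGS = {
    0x00000001: "IN_ACCESS",
    0x00000002: "IN_MODIFY",
    0x00000004: "IN_ATTRIB",
    0x00000008: "IN_CLOSE_WRITE",
    0x00000010: "IN_CLOSE_NOWRITE",
    0x00000020: "IN_OPEN",
    0x00000040: "IN_MOVED_FROM",
    0x00000080: "IN_MOVED_TO",
    0x00000100: "IN_CREATE",
    0x00000200: "IN_DELETE",
    0x00000400: "IN_DELETE_SELF",
    0x00000800: "IN_MOVE_SELF",
    0x00002000: "IN_UNMOUNT",
    0x00004000: "IN_Q_OVERFLOW",
    0x00008000: "IN_IGNORED",
    0x40000000: "IN_ISDIR",
}


def decode_inotify_mask(mask):
    """Return the names of all known bits set in an inotify event mask."""
    names = [name for bit, name in IN_FLAGS.items() if mask & bit]
    # Report any bits we do not know about as a single hex value.
    unknown = mask & ~sum(IN_FLAGS)
    if unknown:
        names.append(hex(unknown))
    return names
```

For example, `decode_inotify_mask(1073742080)` gives `['IN_CREATE', 'IN_ISDIR']`, and a mask of `8192` would be `['IN_UNMOUNT']`.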

Feb 01 '25 23:02 e-dant

I don't see how to get more information out of inotify any other way.

I know inotify will capture fs umount events and generate a corresponding event. Maybe it is a good idea to pass that event along as well?

I might be wrong about inotify noticing the umount events. It could be fanotify.

I have used the CLI tool inotifywait from inotify-tools, and it does appear to notify the user of umount events.

https://www.man7.org/linux/man-pages/man1/inotifywait.1.html#EVENTS

inotifywait will use fanotify on kernels > 5.9, and I am running the program as root on a recent kernel, so the umount event may be generated by fanotify instead.

I originally proposed this because it would help automated tools know when to terminate or restart the watcher, since the watcher does not persist across fs mount / umount events.

For example, when monitoring a soft, dynamically mounted network file system that is unreliable and often gets disconnected, it might be a good idea to inform downstream tools that monitoring has stopped, since automatic operations should not run while it is disconnected and should restart automatically after the mount recovers. Currently it looks like significantly more logic is needed to detect and handle these events, so perhaps the caller should handle this in an OS-specific way.
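As one possible caller-side approach, a downstream tool could remember which object the watched path named when watching began and periodically check whether it still names the same one. This is only a sketch under the assumption that a mount or umount on the path changes its `(st_dev, st_ino)` pair; `path_identity` and `watch_is_stale` are hypothetical helpers, and polling like this is inherently racy.

```python
import os


def path_identity(path):
    """Return (st_dev, st_ino) for `path`, or None if it cannot be stat'ed."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_dev, st.st_ino)


def watch_is_stale(path, identity_at_start):
    """True if `path` no longer names the object it named when watching began."""
    return path_identity(path) != identity_at_start
```

The caller would take `path_identity(path)` right before starting the watcher, then restart the watcher (or pause automation) whenever `watch_is_stale` returns True.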

Feb 02 '25 04:02 yufei-pan

AFAICT there are two ways to do this with fanotify. (eBPF could also do this in different ways.)

Both of our solutions need to know whether or not some given path is actually a mount point.

The relevant documentation for fanotify around FAN_MARK_MOUNT suggests that we can just OR that with our regular FAN_MARK_ADD (temporarily removing the is_dir check to test this out).

Unfortunately, that does not seem to be the case. fanotify_mark returns an error if FAN_MARK_MOUNT was specified and the path was not a mount point (at least on the 6.1 kernel).

Things work well once we specify FAN_MARK_MOUNT on an actual mount point, though. So the tricky part is how to determine reliably, as quickly as possible, and ideally race-free, whether a path is a mount point.
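For reference, the classic heuristic compares a directory's device ID with its parent's. A minimal Python sketch of that check, done through a single directory file descriptor so both stats refer to the same object (this narrows the race window but does not close it). `is_mount_point` is a hypothetical helper; it misses bind mounts within one filesystem and only handles directories. Python's standard `os.path.ismount` does a similar path-based comparison.

```python
import os


def is_mount_point(path):
    """Best-effort check: is `path` on a different device than its parent?

    Heuristic only: bind mounts within one filesystem are not detected.
    """
    # O_PATH (Linux) pins the directory without needing read permission.
    flags = os.O_RDONLY | os.O_DIRECTORY | getattr(os, "O_PATH", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        return False  # missing, not a directory, or not accessible
    try:
        here = os.fstat(fd)
        parent = os.stat("..", dir_fd=fd)
    except OSError:
        return False
    finally:
        os.close(fd)
    if here.st_dev != parent.st_dev:
        return True
    # Same device: only the root directory is its own parent.
    return here.st_ino == parent.st_ino
```

A check like this could gate whether FAN_MARK_MOUNT is attempted at all, with the error from fanotify_mark still handled as the final word.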

Combing through the output of strace stat -c '%m' mnt (where mnt is some valid mount point), we see a bunch of garbage:

statx(AT_FDCWD, "mnt", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_INO, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
getcwd("/tmp/wtr", 1024)                = 9
readlink("/tmp/wtr/mnt", 0x7ffd203cdbc0, 1023) = -1 EINVAL (Invalid argument)
openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0444, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(3, "22 28 0:20 / /sys rw,nosuid,node"..., 1024) = 1024
read(3, "/efi/efivars rw,nosuid,nodev,noe"..., 1024) = 1024
read(3, "ls/systemd-tmpfiles-setup-dev.se"..., 1024) = 1024
read(3, "78 26 0:53 / /run/user/1000 rw,n"..., 1024) = 1024
read(3, "878bec3/work\n834 26 0:4 net:[402"..., 1024) = 101
read(3, "", 1024)                       = 0
lseek(3, 0, SEEK_CUR)                   = 4197
close(3)                                = 0
newfstatat(AT_FDCWD, "/tmp/wtr/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
openat(AT_FDCWD, ".", O_RDONLY|O_CLOEXEC) = 3
chdir("mnt")                            = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
chdir("..")                             = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=65536, ...}, 0) = 0
chdir("..")                             = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
chdir("..")                             = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
getcwd("/", 4096)                       = 2
fchdir(3)                               = 0
close(3)                                = 0
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}, AT_EMPTY_PATH) = 0
write(1, "/\n", 2)                      = 2

That is a lot of work to determine if some path is a mount point. It seems to read /proc/self/mountinfo and then do some checks on the parent paths afterwards (not entirely sure why yet).
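The mountinfo part of that trace can be sketched on its own. The following Python example pulls the mount point (the fifth space-separated field, per the proc mountinfo documentation) out of each line and undoes the kernel's octal escapes. `parse_mount_points` and `current_mount_points` are hypothetical names, and this is not a claim about how `stat` itself implements `%m`.

```python
import re

_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def parse_mount_points(mountinfo_text):
    """Return the set of mount points listed in mountinfo-formatted text."""
    points = set()
    for line in mountinfo_text.split("\n"):
        fields = line.split(" ")
        if len(fields) < 5:
            continue  # blank or malformed line
        # Field 5 is the mount point; the kernel escapes space, tab,
        # newline and backslash as \040, \011, \012 and \134.
        raw = fields[4]
        points.add(_OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), raw))
    return points


def current_mount_points():
    """Read and parse this process's view of the mount table (Linux only)."""
    with open("/proc/self/mountinfo", encoding="utf-8",
              errors="surrogateescape") as f:
        return parse_mount_points(f.read())
```

Checking `os.path.realpath(path) in current_mount_points()` would then answer the question with one file read, at the cost of the same kind of race between reading the table and using the answer.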

(Side note: this is somewhat unavoidable given the filesystem APIs the system provides, but there are lots of race conditions in that logic, with many opportunities for some other part of a live system to change the type of file the path points to while we are looking at it. We might have to live with something similar.)

There is a newer system call that would avoid a lot of this, statmount, but having been introduced in the 6.8 kernel, I'm not sure how much use it would get yet.

Anyway, we can fix this pretty easily. Maybe the performance cost of all that file I/O and those system calls is worth it for what is likely to be a (relatively) rare thing to encounter: mount points.

Feb 02 '25 17:02 e-dant

I'll investigate how inotifywait does this.

I also wonder if there's a way we can use FAN_MARK_FILESYSTEM to our advantage in some cases:

   FAN_MARK_FILESYSTEM (since Linux 4.20)
         Mark the filesystem specified by pathname.  The filesystem
         containing pathname will be marked.  All the contained
         files and directories of the filesystem from any mount
         point will be monitored.  Use of this flag requires the
         CAP_SYS_ADMIN capability.

I had previously been cautious about that because we rarely want to watch everything; more likely we just want to watch some things on a filesystem.

But it does divert this work in particular to the kernel, which is the best we can do.
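Since FAN_MARK_FILESYSTEM needs CAP_SYS_ADMIN, a caller might want to check for that capability up front before choosing this path. A rough Python sketch that reads the effective capability set from `/proc/self/status`; it assumes capability number 21 for CAP_SYS_ADMIN (from `<linux/capability.h>`), the helper names are hypothetical, and holding the capability is necessary but may not be sufficient (for example inside some containers).

```python
CAP_SYS_ADMIN = 21  # from <linux/capability.h>


def has_capability(status_text, cap=CAP_SYS_ADMIN):
    """Check the effective capability set in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hex bitmask; bit N corresponds to capability N.
            return bool(int(line.split()[1], 16) & (1 << cap))
    return False


def can_mark_filesystem():
    """True if this process appears to hold CAP_SYS_ADMIN (Linux only)."""
    with open("/proc/self/status") as f:
        return has_capability(f.read())
```

If the check fails, falling back to per-directory marks (or inotify) would keep the unprivileged case working.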

Feb 02 '25 18:02 e-dant