On Linux with the Python lib, once the filesystem containing the monitored path is unmounted, no events are generated:
umount test
mount -t tmpfs tmpfs test
touch hi
-> no events generated
Maybe this is inotify's behavior? Maybe the watcher monitors the inode instead of the path.
I know inotify will capture fs umount events and generate a corresponding event. Maybe it would be a good idea to pass that event along as well?
This also happens when a filesystem gets mounted on a path or subfolder the program is watching:
mkdir subdir
touch subdir/hi
-> events generated
mount -t tmpfs tmpfs subdir
touch subdir/hi2
-> no events generated
thinking and looking. might not be able to get around to this one today.
One thing we could do is try to create new inotify instances per-filesystem. (As a workaround, this would be somewhat equivalent to launching new instances of watcher on a per-filesystem basis.)
I don't see how to get more information out of inotify any other way.
I know inotify will capture fs umount events and generate a corresponding event. maybe it is a good idea to also pass that event as well?
I don't know how.
Testing with:
[ -d /tmp/wtr/mnt ] || mkdir -p /tmp/wtr/mnt ; cd /tmp/wtr ; sudo mount -t tmpfs tmpfs mnt ; touch mnt/a ; ll mnt ; sudo umount mnt
I see nothing extra from inotify when running with the following patch. No new events. We don't appear to be missing something on the release branch.
diff --git a/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp b/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
index ac861b9..d1d3ac2 100644
--- a/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
+++ b/devel/include/detail/wtr/watcher/adapter/linux/inotify/watch.hpp
@@ -78,13 +78,7 @@ struct ke_in_ev {
IN_OPEN
*/
static constexpr unsigned recv_mask
- = IN_CREATE
- | IN_DELETE
- | IN_DELETE_SELF
- | IN_MODIFY
- | IN_MOVE_SELF
- | IN_MOVED_FROM
- | IN_MOVED_TO;
+ = IN_ALL_EVENTS;
int fd = -1;
using paths = std::unordered_map<int, std::filesystem::path>;
@@ -328,6 +322,38 @@ inline auto do_ev_recv = [](auto const& cb, sysres& sr) -> result
return has_any && ! is_self_info;
};
+ auto dbg_inotify_event = [](inotify_event const* const in)
+ {
+ if (! in) {
+ fprintf(stderr, "inotify_event: null\n");
+ return;
+ }
+ auto msk = in->mask;
+ auto isdir = msk & IN_ISDIR;
+ auto iscreate = msk & IN_CREATE;
+ auto isdelete = msk & IN_DELETE;
+ auto ismodify = msk & IN_MODIFY;
+ auto ismove = msk & IN_MOVE;
+ auto isself = msk & IN_MOVE_SELF;
+ auto isfrom = msk & IN_MOVED_FROM;
+ auto isto = msk & IN_MOVED_TO;
+ fprintf(stderr, "inotify_event (msk=%s%s%s%s%s%s%s%s) {\n",
+ isdir ? "d" : ".",
+ iscreate ? "c" : ".",
+ isdelete ? "d" : ".",
+ ismodify ? "m" : ".",
+ ismove ? "m" : ".",
+ isself ? "s" : ".",
+ isfrom ? "f" : ".",
+ isto ? "t" : ".");
+ fprintf(stderr, " int wd: %d\n", in->wd);
+ fprintf(stderr, " uint32_t mask: %u\n", in->mask);
+ fprintf(stderr, " uint32_t cookie: %u\n", in->cookie);
+ fprintf(stderr, " uint32_t len: %u\n", in->len);
+ fprintf(stderr, " char name: %s\n", in->name);
+ fprintf(stderr, "}\n");
+ };
+
auto read_len = read(sr.ke.fd, sr.ke.ev_buf, sizeof(sr.ke.ev_buf));
if (read_len < 0 && errno != EAGAIN)
return result::e_sys_api_read;
@@ -337,6 +363,7 @@ inline auto do_ev_recv = [](auto const& cb, sysres& sr) -> result
unsigned in_ev_c = 0;
auto dmrm = defer_dm_rm_wd{sr.ke};
while (in_ev && in_ev < in_ev_tail) {
+ dbg_inotify_event(in_ev);
auto in_ev_next = peek(in_ev, in_ev_tail);
unsigned msk = in_ev->mask;
if (in_ev_c++ > ke_in_ev::c_ulim)
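For what it's worth, the debug printer in the patch only decodes the create/delete/modify/move bits. inotify(7) also documents bits the kernel can set in a returned mask without being asked: IN_UNMOUNT, IN_IGNORED and IN_Q_OVERFLOW. As far as I understand it, IN_UNMOUNT is only reported for a watch whose watched object lives on the filesystem being unmounted, so a watch placed on the directory underneath a mount point may never see it. A minimal sketch of a fuller decoder follows; `describe_inotify_mask` is a hypothetical helper, not something in the watcher.

```cpp
#include <sys/inotify.h>
#include <cstdint>
#include <string>

// Hypothetical debugging helper (not part of the watcher): names every bit
// set in an inotify event mask, including the bits the kernel sets on its own.
inline std::string describe_inotify_mask(std::uint32_t msk)
{
  struct bit {
    std::uint32_t value;
    char const* name;
  };
  static constexpr bit bits[] = {
    {IN_ACCESS, "IN_ACCESS"},
    {IN_MODIFY, "IN_MODIFY"},
    {IN_ATTRIB, "IN_ATTRIB"},
    {IN_CLOSE_WRITE, "IN_CLOSE_WRITE"},
    {IN_CLOSE_NOWRITE, "IN_CLOSE_NOWRITE"},
    {IN_OPEN, "IN_OPEN"},
    {IN_MOVED_FROM, "IN_MOVED_FROM"},
    {IN_MOVED_TO, "IN_MOVED_TO"},
    {IN_CREATE, "IN_CREATE"},
    {IN_DELETE, "IN_DELETE"},
    {IN_DELETE_SELF, "IN_DELETE_SELF"},
    {IN_MOVE_SELF, "IN_MOVE_SELF"},
    // Set by the kernel regardless of the mask passed to inotify_add_watch.
    {IN_UNMOUNT, "IN_UNMOUNT"},
    {IN_Q_OVERFLOW, "IN_Q_OVERFLOW"},
    {IN_IGNORED, "IN_IGNORED"},
    {IN_ISDIR, "IN_ISDIR"},
  };
  std::string out;
  for (auto const& b : bits) {
    if (msk & b.value) {
      if (! out.empty()) out += '|';
      out += b.name;
    }
  }
  if (out.empty()) out = "0";
  return out;
}
```

Calling this on `in->mask` inside the debug lambda would make an unmount-related event stand out if one ever arrives.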
I know inotify will capture fs umount events and generate a corresponding event. maybe it is a good idea to also pass that event as well?
I might be wrong about inotify noticing the umount events. It could be fanotify.
I have used the CLI tool inotifywait from inotify-tools, and it does appear to notify the user of umount events.
https://www.man7.org/linux/man-pages/man1/inotifywait.1.html#EVENTS
inotifywait uses fanotify on kernels > 5.9, and I am running the program as root on a recent kernel, so the umount event may be generated by fanotify instead.
I originally proposed this because it would help automated tools know when to terminate or restart the watcher, since the watcher does not persist across fs mount / umount events.
For example, when monitoring a soft, dynamically mounted network file system that is unreliable and often gets disconnected, it would be good to inform downstream tools that monitoring has stopped: automatic operations should not run while it is disconnected, and the watcher should restart once the mount recovers. Currently it looks like detecting and handling these events would need significantly more logic, so perhaps the caller should handle this in an OS-specific way.
AFAICT there are two ways to do this with fanotify. (ebpf could also do this in different ways.)
Both of our solutions need to know whether or not some given path is actually a mount point.
The relevant documentation for fanotify around FAN_MARK_MOUNT suggests that we can just OR that with our regular FAN_MARK_ADD here (and temporarily removing that is_dir check for testing this out).
Unfortunately, that does not seem to be the case. fanotify_mark returns an error if FAN_MARK_MOUNT was specified and the path was not a mount point (at least on the 6.1 kernel).
Things work well once we specify that FAN_MARK_MOUNT, though, so how to reliably determine, as quickly as possible, ideally race-free if a path is a mount point... That's the tricky part.
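One candidate for a cheap check is a single statx call: since Linux 5.8 the kernel reports STATX_ATTR_MOUNT_ROOT in stx_attributes when the path is the root of a mount. Below is a minimal sketch, assuming glibc >= 2.28 for the statx wrapper; `is_mount_root` is a hypothetical name, and the result can still go stale if something mounts or unmounts right after the call.

```cpp
#include <fcntl.h>     // AT_FDCWD, AT_SYMLINK_NOFOLLOW, AT_NO_AUTOMOUNT
#include <sys/stat.h>  // statx, struct statx (glibc >= 2.28)
#include <optional>

// Older userspace headers may lack this attribute bit (added in Linux 5.8).
#ifndef STATX_ATTR_MOUNT_ROOT
#define STATX_ATTR_MOUNT_ROOT 0x00002000
#endif

// Hypothetical helper: true/false if the kernel can tell us whether `path`
// is the root of a mount, std::nullopt if the call fails or the kernel does
// not report the attribute.
inline std::optional<bool> is_mount_root(char const* path)
{
  struct statx stx {};
  int const flags = AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT;
  if (statx(AT_FDCWD, path, flags, STATX_BASIC_STATS, &stx) != 0)
    return std::nullopt;
  // stx_attributes_mask says which attribute bits are meaningful here.
  if (! (stx.stx_attributes_mask & STATX_ATTR_MOUNT_ROOT))
    return std::nullopt;
  return (stx.stx_attributes & STATX_ATTR_MOUNT_ROOT) != 0;
}
```

A caller could add FAN_MARK_MOUNT only when this returns true and fall back to a slower check when it returns std::nullopt.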
Combing through the output of strace stat -c '%m' mnt (where mnt is some valid mount point), we see a bunch of garbage:
statx(AT_FDCWD, "mnt", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_INO, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
getcwd("/tmp/wtr", 1024) = 9
readlink("/tmp/wtr/mnt", 0x7ffd203cdbc0, 1023) = -1 EINVAL (Invalid argument)
openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0444, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(3, "22 28 0:20 / /sys rw,nosuid,node"..., 1024) = 1024
read(3, "/efi/efivars rw,nosuid,nodev,noe"..., 1024) = 1024
read(3, "ls/systemd-tmpfiles-setup-dev.se"..., 1024) = 1024
read(3, "78 26 0:53 / /run/user/1000 rw,n"..., 1024) = 1024
read(3, "878bec3/work\n834 26 0:4 net:[402"..., 1024) = 101
read(3, "", 1024) = 0
lseek(3, 0, SEEK_CUR) = 4197
close(3) = 0
newfstatat(AT_FDCWD, "/tmp/wtr/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
openat(AT_FDCWD, ".", O_RDONLY|O_CLOEXEC) = 3
chdir("mnt") = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
chdir("..") = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=65536, ...}, 0) = 0
chdir("..") = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
chdir("..") = 0
newfstatat(AT_FDCWD, "..", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
getcwd("/", 4096) = 2
fchdir(3) = 0
close(3) = 0
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}, AT_EMPTY_PATH) = 0
write(1, "/\n", 2/
) = 2
That is a lot of work to determine if some path is a mount point. It seems to read /proc/self/mountinfo and then do some checks on the parent paths afterwards (not entirely sure why yet).
(Side note: this is somewhat unavoidable given the filesystem APIs the system provides, but there are lots of race conditions in that logic, with many opportunities for some other part of a live system to change the type of file that path points to while we are looking at it. We might have to live with something similar.)
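One plausible reading of the parent walk in that trace is the classic device-ID comparison: keep stepping to the parent directory until st_dev changes, and the last directory on the original device is the mount point. Here is a sketch of that idea, not a claim about what stat actually does internally; `nearest_mount_point` is a hypothetical name. Note that it cannot see a bind mount of the same filesystem, because both sides share one st_dev.

```cpp
#include <sys/stat.h>
#include <filesystem>
#include <optional>
#include <system_error>

// Hypothetical helper: walks up from `start` and returns the topmost
// ancestor that is still on the same device, i.e. the nearest mount point
// as far as st_dev can tell. Returns std::nullopt on any lookup failure.
inline std::optional<std::filesystem::path>
nearest_mount_point(std::filesystem::path const& start)
{
  namespace fs = std::filesystem;
  std::error_code ec;
  fs::path cur = fs::canonical(start, ec);  // absolute, symlinks resolved
  if (ec) return std::nullopt;

  struct stat here {};
  if (stat(cur.c_str(), &here) != 0) return std::nullopt;

  while (cur != cur.root_path()) {
    fs::path parent = cur.parent_path();
    struct stat up {};
    if (stat(parent.c_str(), &up) != 0) return std::nullopt;
    if (up.st_dev != here.st_dev) break;  // parent is on another filesystem
    cur = parent;
    here = up;
  }
  return cur;
}
```

A path is then a mount point, in this limited sense, when `nearest_mount_point(p)` equals the canonical form of `p` itself.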
There is a newer system call that would avoid a lot of this, statmount, but having been introduced in the 6.8 kernel, I'm not sure how much use it would get yet.
Anyway, we can fix this pretty easily. Maybe the performance cost of all that file I/O and those system calls is worth it for what is likely to be a (relatively) rare thing to encounter: mount points.
I'll investigate how inotifywait does this.
I also wonder if there's a way we can use FAN_MARK_FILESYSTEM to our advantage in some cases:
FAN_MARK_FILESYSTEM (since Linux 4.20) Mark the filesystem specified by pathname. The filesystem containing pathname will be marked. All the contained files and directories of the filesystem from any mount point will be monitored. Use of this flag requires the CAP_SYS_ADMIN capability.
I had been previously cautious about that because there are only so many times we really want to watch everything, more likely we just want to watch some things on a filesystem.
But it does divert this work in particular to the kernel, which is the best we can do.
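A rough sketch of what opting into that could look like: try a filesystem-wide mark first and fall back to an ordinary mark on the path when the kernel refuses (for example, without CAP_SYS_ADMIN). The names `mark_scope` and `mark_filesystem_or_path` are hypothetical, and which event masks each kind of mark accepts depends on how the fanotify group was initialized, so treat the mask as the caller's problem here.

```cpp
#include <fcntl.h>         // AT_FDCWD
#include <sys/fanotify.h>  // fanotify_mark, FAN_MARK_*
#include <cstdint>

// Older userspace headers may lack this flag (added in Linux 4.20).
#ifndef FAN_MARK_FILESYSTEM
#define FAN_MARK_FILESYSTEM 0x00000100
#endif

// Hypothetical result type: which kind of mark ended up being placed.
enum class mark_scope { filesystem, path_only, failed };

// Hypothetical helper: prefers a filesystem-wide mark, otherwise marks just
// the given path. `fan_fd` is assumed to come from fanotify_init.
inline mark_scope
mark_filesystem_or_path(int fan_fd, char const* path, std::uint64_t mask)
{
  if (fanotify_mark(fan_fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM, mask,
                    AT_FDCWD, path) == 0)
    return mark_scope::filesystem;
  // Not permitted or not supported: mark only the object at `path`.
  if (fanotify_mark(fan_fd, FAN_MARK_ADD, mask, AT_FDCWD, path) == 0)
    return mark_scope::path_only;
  return mark_scope::failed;
}
```

With a filesystem-wide mark, events from everywhere on that filesystem arrive, so the caller would still need to filter them down to the watched subtree.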