unionfs-fuse
unionfs-fuse copied to clipboard
Use at-family of syscalls instead of building paths
We use unionfs to overlay a ramdisk and NFS server to create the / filesystem of network booted nodes. This makes it kind of hard to have the individual branches visible and reachable by path. Any access to
But there is a better option. The *at family of syscalls allows accessing files relative to a file descriptor of a directory. Unionfs can open the directory for each branch and store the FD. Then when trying to access a file on a branch instead of building a new, branch specific, path the respective *at function can be used with the stored FD. E.g. readlink(branch_path, buf, bufsize) becomes readlink(branch_fd, path, bug, bufsize).
This has 2 benefits:
- No need to create a individual path for each branch. Which means this is faster and avoids the problem of hitting ENAMETOOLONG because the combined path is too long.
- avoids the looping and the need to chroot
We've been using a patched unionfs 0.24 with *at() support for years and are now looking into updating to 2.0. So I'm wondering if there is interest for an updated patch and how it should look like.
The current patch for 0.24 adds a compile time option and (if enabled) replaces all filesystem syscalls with vardiac macros that take a branch and paths, e.g.
+int branch_symlink(const char *oldpath, int branch, ...); +#define SYMLINK(oldpath, branch, ...) branch_symlink(oldpath, branch, VA_ARGS, NULL)
-
if (symlink(link, cow->to_path)) {
-
if (SYMLINK(link, cow->to_branch, cow->to_path)) {
The reason for the vardiac macro is that in places paths are created inside unionfs, e.g. to add _HIDDEN~ to the filename in filedir_hidden(). The macro avoid constructing one path to add the _HIDDEN~ and then another to add the branch path in front (and fix the bug that ENAMETOOLONG is not checked here).
yes, i'll be interested in that! ...do you have the 0.24 patch somewhere?
https://github.com/Q-Leap-Networks/unionfs-fuse/tree/feature-at-family
oh, boy, that's huge... :-( ...any special reason you picked macros and any special reason you picked varargs? (why not just add ifdefs?)
In some places unionfs adds things to the path, like the _HIDDEN~. With the varargs the code can just pass all parts of the path as argument and the final build_path call will build a complete path once. The idea was to only copy all the path parts and check for ENAMETOOLONG once. And iirc the macros are needed so the varargs work.
Ideally the path shouldn't be copied at all for most operations when using *at functions, only a few places need it. But without them it needs to copy. The macros and varargs allowed to use the same API in all cases at the cost of copying always once. Something worth reconsidering.
Note: The original code did not check for ENAMETOOLONG in several places.