fuse-mt
Why does the API have file handles and Paths in several calls?
I've been trialing a fuse filesystem implementation to see if it's workable to have a local filesystem that overflows into a NAS so you can have your multi-TB photo/video collection always available on your laptop. So far I've just implemented a basic naive in-RAM filesystem to try out the API:
https://github.com/pedrocr/syncer/blob/1110097ecac53dca8133b0fd8d883c1ea597dba4/src/main.rs
This seems to work fine, but I have just ignored the file handles in the API completely and gone with the Paths at all times. Is this way of working completely broken? Having file handles is problematic for my design, as I'm planning on having the backend be content addressable instead of having some kind of sequential inode number.
File handles are used when working on an opened file. When FUSE calls open, you can give it some arbitrary u64 as a handle, and then when FUSE calls read, write, etc., it passes along that same handle you gave it.
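A minimal sketch of that round trip, assuming a hypothetical HandleTable that the filesystem owns (none of these names come from fuse-mt). The handle is just a key into per-open state, so a content-addressable backend could store a content hash there instead of a sequential inode number:

```rust
use std::collections::HashMap;

/// Per-open state; here just a key into the (hypothetical) backing store.
struct OpenFile {
    data_key: String,
}

/// Hands out handles on open and looks them up on later calls.
struct HandleTable {
    next: u64,
    open: HashMap<u64, OpenFile>,
}

impl HandleTable {
    fn new() -> Self {
        HandleTable { next: 1, open: HashMap::new() }
    }

    /// Called from open(): remember the state and return a fresh handle.
    fn insert(&mut self, data_key: &str) -> u64 {
        let fh = self.next;
        self.next += 1;
        self.open.insert(fh, OpenFile { data_key: data_key.to_string() });
        fh
    }

    /// Called from read()/write(): the same handle comes back from FUSE.
    fn get(&self, fh: u64) -> Option<&OpenFile> {
        self.open.get(&fh)
    }

    /// Called from release(): the handle is no longer valid afterwards.
    fn remove(&mut self, fh: u64) -> Option<OpenFile> {
        self.open.remove(&fh)
    }
}
```

In a real filesystem the table would sit behind a lock, since fuse-mt dispatches calls from multiple threads.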
Looking at your code, you don't seem to be implementing open at all, so the file handle never gets set, so you can ignore it.
(To be honest, I'm surprised that it works with open unimplemented. FUSE must have some special logic to ignore the ENOSYS that comes from the default implementation of that function.)
Be aware, however, that doing things this way can behave strangely if you also implement unlink (which you have), because in a proper POSIX filesystem, once you have a file open, even if you unlink that file, it should still be accessible through the open file handle. But in your filesystem, unlinking an open file causes opened handles to that file to no longer work. Other programs may behave badly in this case because they're not expecting that.
For example, the following C code will do the wrong thing:
int fd = open("file.txt", O_RDWR);
unlink("file.txt");
ssize_t result = write(fd, "hello", 5);
printf("result = %zd, errno = %d\n", result, errno);  /* %zd because result is ssize_t */
On a normal filesystem, it should print result = 5, errno = 0, meaning that it wrote 5 bytes even though the file was unlinked.
On yours, it will print result = -1, errno = 2, where 2 is ENOENT.
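One way to get the POSIX behaviour in an in-RAM filesystem is to let open handles share ownership of the file data with the directory entry. The sketch below is an illustration only: MemFs and its methods are hypothetical names, not part of fuse-mt, and write() simply appends to keep the example short.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// File contents shared between the directory entry and any open handles.
type Inode = Arc<Mutex<Vec<u8>>>;

const ENOENT: i32 = 2;
const EBADF: i32 = 9;

#[derive(Default)]
struct MemFs {
    names: HashMap<String, Inode>, // directory entries
    handles: HashMap<u64, Inode>,  // open files
    next_fh: u64,
}

impl MemFs {
    fn create(&mut self, name: &str) {
        self.names.insert(name.to_string(), Arc::new(Mutex::new(Vec::new())));
    }

    /// open(): clone the Arc so the data can outlive the name.
    fn open(&mut self, name: &str) -> Result<u64, i32> {
        let inode = self.names.get(name).ok_or(ENOENT)?.clone();
        self.next_fh += 1;
        self.handles.insert(self.next_fh, inode);
        Ok(self.next_fh)
    }

    /// unlink(): drop only the name; open handles keep the data alive.
    fn unlink(&mut self, name: &str) -> Result<(), i32> {
        self.names.remove(name).map(|_| ()).ok_or(ENOENT)
    }

    /// write(): look up by handle, never by name. Appends for simplicity.
    fn write(&mut self, fh: u64, data: &[u8]) -> Result<usize, i32> {
        let inode = self.handles.get(&fh).ok_or(EBADF)?;
        inode.lock().unwrap().extend_from_slice(data);
        Ok(data.len())
    }

    /// release(): the data is freed once the last Arc is dropped.
    fn release(&mut self, fh: u64) {
        self.handles.remove(&fh);
    }
}
```

With this layout, creating file.txt, opening it, unlinking it and then writing 5 bytes through the handle returns Ok(5), while a second open of the same name fails with ENOENT.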
Thanks for the extremely prompt reply :)
I understand why Unix has file handles; my question was more about the API having several cases where both a handle and a path are given. Sometimes the handle is optional. It seems the API is halfway to abstracting away the handles but not quite there. For example, chmod() takes a &Path and an Option<u64>, while in read() the file handle is no longer optional but the path is still provided. How did you envision this being used? If read always needs to take the handle, why is a path provided? If chmod always receives a Path, why is the handle optional?
The file handles are optional for some calls because the kernel may call them with or without having previously opened the file. Using chmod as an example again: there exist two Linux syscalls, chmod, which takes a path, and fchmod, which takes an open file descriptor. FUSE turns them both into one call which gets an inode number and maybe a file handle. Fuse-MT then turns the inode number back into a path, using a map it maintains.
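As a rough illustration of that translation step (a sketch only, not Fuse-MT's actual internals; InodeTable, Target, remember and translate are made-up names), a request carrying an inode number and an optional handle can be turned into a path plus the same optional handle like this:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// What a path-based filesystem callback would receive.
#[derive(Debug, PartialEq)]
struct Target {
    path: PathBuf,
    fh: Option<u64>,
}

/// Inode-to-path map, filled in as the kernel looks names up.
#[derive(Default)]
struct InodeTable {
    paths: HashMap<u64, PathBuf>,
}

impl InodeTable {
    fn remember(&mut self, ino: u64, path: &Path) {
        self.paths.insert(ino, path.to_path_buf());
    }

    /// Turn the (inode, optional handle) pair of a request into a
    /// (path, optional handle) pair. None means the inode is unknown.
    fn translate(&self, ino: u64, fh: Option<u64>) -> Option<Target> {
        let path = self.paths.get(&ino)?.clone();
        Some(Target { path, fh })
    }
}
```

A request that started as chmod would arrive with fh set to None, and one that started as fchmod with Some(handle); the path is filled in from the map in both cases.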
The chmod()/fchmod() case I had seen in the man page and assumed something of the sort was happening. I'm more puzzled how the Fuse-MT write() is supposed to be used. Normal UNIX write() is always on a file descriptor as far as I can tell, so why does the Fuse-MT API even give me a path? Should I read it as "this is the path the original file descriptor was obtained for but it may have been invalidated the moment it was used so don't trust it ever again for anything"? Why even have it in the API? For read-only filesystems to be easier to write?
For read-only filesystems to be easier to write?
Pretty much, yeah. FUSE gives us the file handle and the inode number, and Fuse-MT maintains an inode <-> path mapping, so it's easy and efficient to figure out which path is being referred to, so might as well provide it to the filesystem. It lets you write stateless read-only filesystems that don't depend on file handles, makes debugging / logging easier, etc.
But yeah, if you allow unlinking and/or hard linking, take that path with a grain of salt. In the case of hard links, the path given may not be the same as the path used to open the file, and in the case of unlinks, the path may not exist any more. Prefer using the file handle.
I've switched over to a fully inode based structure that I can then associate handles to:
https://github.com/pedrocr/syncer/blob/6b1a36f811942af35550532b595e5a348dabb922/src/main.rs
This seems to work fine although I need to run some kind of test suite on it to check for corner cases. Rust's more modern features over C/C++ really help here. I created a set of with_path()/with_handle()/with_path_optional_handle() methods that abstract away all that stuff and allow the filesystem method implementations to be really simple.
For future reference it would probably help to add something to the FilesystemMT documentation about paths/handles. Something like:
- If you're implementing a filesystem that does not allow removing files/directories or hard linking files then you can ignore file and directory handles and just use the Paths for everything. fuse_mt keeps an internal mapping that solves everything for you.
- If you've implemented any of those features you should return handles from open()/opendir() and use them exclusively (ignoring the path) in read()/write()/readdir(). In all the other calls that take an optional file handle you should use it if it is passed (and again ignore the path) and only if that's not the case use the path instead.
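The second rule can be sketched as a single lookup helper. This is a guess at the general shape of such a helper, not the code from the linked repository; Fs, Node and the field names are assumptions made for the example, and the errno values are defined locally.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

const ENOENT: i32 = 2;
const EBADF: i32 = 9;

struct Node {
    mode: u32,
}

#[derive(Default)]
struct Fs {
    nodes: HashMap<u64, Node>,      // node id -> metadata
    by_path: HashMap<PathBuf, u64>, // current names
    by_handle: HashMap<u64, u64>,   // open handle -> node id
}

impl Fs {
    /// Resolve a node: the handle wins when present; the path is only a fallback.
    fn with_path_optional_handle<T>(
        &mut self,
        path: &Path,
        fh: Option<u64>,
        f: impl FnOnce(&mut Node) -> T,
    ) -> Result<T, i32> {
        let id = match fh {
            Some(fh) => *self.by_handle.get(&fh).ok_or(EBADF)?,
            None => *self.by_path.get(path).ok_or(ENOENT)?,
        };
        let node = self.nodes.get_mut(&id).ok_or(ENOENT)?;
        Ok(f(node))
    }

    /// Example caller: works for both the chmod and fchmod flavours.
    fn chmod(&mut self, path: &Path, fh: Option<u64>, mode: u32) -> Result<(), i32> {
        self.with_path_optional_handle(path, fh, |node| node.mode = mode)
    }
}
```

Because the handle is consulted first, a chmod through an open handle still succeeds after the name has been removed, while the same call without a handle fails with ENOENT.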
It seems like it might be an even nicer API if there were different traits for filesystems that implement unlinking and/or renaming versus those that do not. Then the API could be such that you couldn't make the mistake of not returning (or using) file handles. It would make for a more complicated API, but could also make it harder to accidentally create a non-POSIX filesystem.
@droundy that's an interesting solution. Maybe a SimpleFilesystemMT trait that just removes all the handles as well as open()/opendir()/rmdir()/unlink()/link()/release()/releasedir(), and FilesystemMT drops path from read()/write()/flush()/release()/fsync()/readdir()/releasedir()/fsyncdir()?
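A rough sketch of what such a split could look like, cut down to a handful of methods. Both traits here are hypothetical: the FilesystemMT below is a reshaped illustration, not the crate's real trait, and the signatures are simplified assumptions.

```rust
use std::path::Path;

type Errno = i32;

/// Path-only trait: no handles, and no way to remove or hard link names.
trait SimpleFilesystemMT {
    fn read(&self, path: &Path, offset: u64, size: u32) -> Result<Vec<u8>, Errno>;
}

/// Handle-based trait: open() must return a handle, and read() gets only that.
trait FilesystemMT {
    fn open(&self, path: &Path) -> Result<u64, Errno>;
    fn read(&self, fh: u64, offset: u64, size: u32) -> Result<Vec<u8>, Errno>;
    fn unlink(&self, path: &Path) -> Result<(), Errno>;
}

/// A trivial path-only filesystem serving one fixed file.
struct Hello;

impl SimpleFilesystemMT for Hello {
    fn read(&self, path: &Path, offset: u64, size: u32) -> Result<Vec<u8>, Errno> {
        if path != Path::new("/hello.txt") {
            return Err(2); // ENOENT
        }
        let data = b"hello";
        // Clamp the requested range to the file contents.
        let start = (offset as usize).min(data.len());
        let end = (start + size as usize).min(data.len());
        Ok(data[start..end].to_vec())
    }
}
```

The point of the split is that a type implementing only the path-based trait has no unlink() to get wrong, and a type implementing the handle-based one never sees a possibly stale path in read().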
The problem is worse than @pedrocr and @droundy realize. Some FUSE filesystems actually need inode numbers, because that's how they organize themselves. While I like the other things you've done with fuse-mt, converting inode numbers into paths makes it useless for such filesystems. Would you consider making the conversion optional? Perhaps methods could have signatures like this:
struct File {
    ...
}
impl File {
    fn path(&self) -> Option<&Path> {...}
    fn ino(&self) -> u64 {...}
    fn fh(&self) -> Option<u64> {...}
}
fn operation(
    &self,
    req: RequestInfo,
    file: File,
    args: SomethingElse
) -> ResultData
Even better would be if the FilesystemMT::init method allowed the filesystem to return a flag indicating whether or not it wants pathname translation. Filesystems that don't care can disable it.
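To make the idea concrete, here is one possible shape for it. Everything in this sketch is hypothetical: the PathTranslation flag, the fields of File and the make_file helper are assumptions made for illustration, and nothing like this exists in fuse-mt today.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical answer a filesystem could give from init().
#[derive(Clone, Copy)]
enum PathTranslation {
    Enabled,
    Disabled,
}

/// Concrete version of the File argument proposed above.
struct File {
    path: Option<PathBuf>,
    ino: u64,
    fh: Option<u64>,
}

impl File {
    fn path(&self) -> Option<&Path> {
        self.path.as_deref()
    }
    fn ino(&self) -> u64 {
        self.ino
    }
    fn fh(&self) -> Option<u64> {
        self.fh
    }
}

/// Build the argument for one request; the path lookup is skipped entirely
/// when translation is disabled.
fn make_file(
    mode: PathTranslation,
    ino: u64,
    fh: Option<u64>,
    lookup: impl FnOnce(u64) -> Option<PathBuf>,
) -> File {
    let path = match mode {
        PathTranslation::Enabled => lookup(ino),
        PathTranslation::Disabled => None,
    };
    File { path, ino, fh }
}
```

An inode-organized filesystem would answer Disabled and only ever call ino() and fh(), so it pays nothing for the inode-to-path map.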