pyfilesystem2 MountFS's listdir doesn't take into account existing mounts

Hi, First of all thank you for this really good lib !

I'm hitting a small bug using MountFS: the listdir method doesn't take into account mounts for dir listing.

here's a simple test case:

#!/usr/bin/env python3
from fs.osfs import OSFS
from fs.mountfs import MountFS


mounted = OSFS('dir_a')
original = OSFS('dir_b')

final = MountFS()
final.mount('the_mount', mounted)
final.mount('', original)


print(f'Final FS /: {final.listdir("/")}')
print(f'mounted FS /: {mounted.listdir("/")}')
print(f'Final FS /the_mount: {final.listdir("/the_mount")}')

My fs is:

$ tree dir_a dir_b
dir_a
└── file_a
dir_b
└── file_b

and thus, the resulting final FS looks like so:

.
├── file_a
└── the_mount
    └── file_b

However running the script results in:

Final FS /: ['file_b']
mounted FS /: ['file_a']
Final FS /the_mount: ['file_a']

As we can see, the mounted FS is not taken into account by listdir, however it exists in the MountFS as show by the two last lines. The expected result should be:

Final FS /: ['file_b', 'the_mount']

I could try to hack something up but it would entail adding a final list to https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/mountfs.py#L159 , what do you think ?

Thanks and cheers ! Frank

Jun 18 '21 10:06 Frankkkkk

From a quick look at the mountfs.py file, it looks like things might be "going wrong" because of your usage of '' as the mount-path in final.mount('', original). However I dunno if the "bug" is that this behaviour should be allowed by mountfs, and mountfs is behaving incorrectly; or if mountfs should instead prevent empty strings being used as the mount-path? That's probably a decision for @willmcgugan to make...

Jun 18 '21 10:06 lurch

Indeed, that's a really good point.

However, if we prevent empty strings, I see no other alternative of doing the thing above, thus we would be reducing the featureset of pyfilesystem, which would be a shame ? I'm happy to propose a MR next week if that's okay with you.

Cheers and have a good WK !

Jun 18 '21 13:06 Frankkkkk

@willmcgugan While having another quick look at mountfs.py, there appears to be an error in the docstring for _delegate? It says it returns "(None, None) if no filesystem is mounted on the given path", but looking at the implementation it seems to return (self.default_fs, path) if no filesystem is mounted on the given path ?

@Frankkkkk Note also that MountFS works slightly differently to the mount command on Linux. On Linux you can have a mountpoint mounted at a directory on another mount, and the command will error if the directory doesn't exist. But looking at https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/mountfs.py#L102 it looks like it explicitly prevents overlapping mountpoints (which your empty mount-directory side-steps), and it automatically creates the 'virtual' directory that the mountpoint appears at.

P.S. @willmcgugan I wonder if a MountFS.unmount(path) method might be useful? :shrug:

Jun 18 '21 14:06 lurch

@Frankkkkk I've been playing about with MountFS this afternoon, and I think the assumption that MountFS.mount makes about the mount-paths being non-overlapping, means that allowing a mount-path of '' is indeed the "incorrect" behaviour, which needs to be fixed. I've also found another "incorrect" behaviour in MountFS, which I'll describe below...

However, if we prevent empty strings, I see no other alternative of doing the thing above

I think you might be able to achieve that using a combination of MountFS and MultiFS ? e.g.

#!/usr/bin/env python3
from fs.osfs import OSFS
from fs.mountfs import MountFS
from fs.multifs import MultiFS


mounted = OSFS('dir_a')
original = OSFS('dir_b')

intermediate = MountFS()
intermediate.mount('the_mount', mounted)

final = MultiFS()
final.add_fs('original', original)
final.add_fs('mount_fs', intermediate)

print(f'Final FS /: {final.listdir("/")}')
print(f'mounted FS /: {mounted.listdir("/")}')
print(f'Final FS /the_mount: {final.listdir("/the_mount")}')

prints:

Final FS /: ['the_mount', 'file_b']
mounted FS /: ['file_a']
Final FS /the_mount: ['file_a']

:smiley:

@willmcgugan @althonos Following on from the example-usage at https://docs.pyfilesystem.org/en/latest/reference/mountfs.html , here's a test-case that demonstrates the other MountFS bug that I found:

from fs.mountfs import MountFS
combined_fs = MountFS()
combined_fs.mount('config', config_fs)
combined_fs.mount('resources', resources_fs)

combined_fs.makedir('config/dir1') # all good, this creates 'dir1' in config_fs
combined_fs.makedir('resources/dir2') # all good, this creates 'dir2' in resources_fs
combined_fs.makedir('dir3') # oops, this creates 'dir3' in the internal MemoryFS used by MultiFS, and will be lost when combined_fs is closed!

I've got a WIP patch which I'll be PRing soon.

Note also that MountFS works slightly differently to the mount command on Linux.

I thought about adding a NestedMountFS that would work similarly to the Linux mount command, i.e. you could do something like:

from fs.memoryfs import MemoryFS
from fs.nestedmountfs import NestedMountFS

fs_1 = MemoryFS()
fs_1.makedir('dir1_a')
fs_1.makedir('dir1_b')

fs_2 = MemoryFS()
fs_2.makedir('dir2_a')
fs_2.makedir('dir2_b')

mountfs = NestedMountFS(rootfs=fs_1)
mountfs.mount('dir1_a', fs_2)
mountfs.tree()
# |-- dir1_a
# |   |-- dir2_a
# |   `-- dir2_b
# `-- dir1_b
mountfs.makedir('dir1_b/foo') # creates 'dir1_b/foo' in fs_1
mountfs.makedir('dir1_a/dir2_a/bar') # creates 'dir2_a/bar' in fs_2

and I think it should be fairly straightforward to implement (with most of the changes being in the mount and _delegate functions), but there's all kinds of awkward edge-conditions, like:

what happens (to mountfs) if you do fs_1.removedir('dir1_a')
what happens (to mountfs) if you do fs_1.makedir('dir1_a/dir2_b')
etc. etc.

so it might be easiest to just forget the idea of a NestedMountFS and instead rely on combining MountFS and MultiFS as in my earlier example, as they each have well-defined rules about how they behave with multiple filesystems.

Jun 20 '21 20:06 lurch

Hi @lurch really nice job !

Indeed, the workaround you proposed is good. The only problem is that this works fine for reading operations, but not for writing ones.

For example, I can't make the the_mount dir readonly as creating a file there creates it on the first fs, which is a shame.

You're right with a NestedMountFS. It could be really useful for some operations. But in this case, wouldn't fixing MountFS be better in the long term (instead of dealing with 2 quasi-identical fs) ?

Have a nice week !

Jun 21 '21 12:06 Frankkkkk

For example, I can't make the the_mount dir readonly as creating a file there creates it on the first fs, which is a shame.

You can change which fs is writeable, and also change the priority of the FSes, see https://docs.pyfilesystem.org/en/latest/reference/multifs.html#fs.multifs.MultiFS.add_fs :slightly_smiling_face:

You're right with a NestedMountFS. It could be really useful for some operations. But in this case, wouldn't fixing MountFS be better in the long term (instead of dealing with 2 quasi-identical fs) ?

I did think about that, but unfortunately the semantics between the current MountFS and the proposed NestedMountFS are too different, so it wouldn't be possible to "fix" MountFS to handle both scenarios. For example, this is possible with MountFS:

mount_fs = MountFS()
fs1 = OSFS('path1')
fs2 = OSFS('path2')
mount_fs.mount("/subdir/m1", fs1)
mount_fs.mount("/subdir/m2", fs2)
# creating a file in `/subdir/m1` puts it in fs1, creating a file in `/subdir/m2` puts it in fs2, trying to create a file in `subdir` throws an error

but that wouldn't be possible for NestedMountFS, because NestedMountFS would require the mount-paths to exist before mounting the FSes to them. I.e. the NestedMountFS equivalent of the above would be something like:

basefs = OSFS('basepath')
basefs.makedir('subdir')
basefs.makedir('subdir/m2')
basefs.makedir('subdir/m1')
mount_fs = NestedMountFS(rootfs=basefs)
fs1 = OSFS('path1')
fs2 = OSFS('path2')
mount_fs.mount("/subdir/m1", fs1)
mount_fs.mount("/subdir/m2", fs2)
# creating a file in `/subdir/m1` puts it in fs1, creating a file in `/subdir/m2` puts it in fs2, creating a file in `/subdir` puts it in basefs

If you have any suggested solutions for the edge-conditions I described in my previous message, I'm all ears... :ear:

MountFS solves the separate-changes-in-different-FSes problem by deliberately making the FS root directories non-overlapping, so each FS remains writeable. MultiFS solves the separate-changes-in-different-FSes problem by having the FSes rooted at the same place, having a strict "priority ordering" that the FSes get looked at, and declaring that only one of the FSes is writeable. Hmmm, I wonder if a NestedMultiFS would be a more robust way to handle the edge-conditions than a NestedMountFS? (i.e. drop the requirement that the mount-path needs to exist first, but only allow a single FS to be writeable) :thinking:

multi_fs = NestedMultiFS()
fs1 = OSFS('path1')
fs2 = OSFS('path2')
multi_fs.mount("/subdir", fs1)
multi_fs.mount("/subdir/m2", fs2, write=True)
# creating a file in `/subdir/m2` puts it in fs2, trying to create a file in `subdir` throws an error
# Although what happens if we do
fs3 = OSFS('path3')
multi_fs.mount("/subdir/m2/m3", fs3)
# and then try to create a file in `/subdir/m2/m3` - should it create it in fs2, or should it throw an error because fs3 isn't writeable?

I'd need to spend some more time playing with the existing MultiFS to get a feel for how it works (especially with regards to deletions & renames).

Have a nice week !

You too!

Jun 21 '21 14:06 lurch

@lurch Sorry for the late reply. TBH it's been such a long time since I implemented that, and I don't recall my reasoning there. There's quite possibly edge cases I didn't consider at the time. I'll trust your assessment!

Jun 21 '21 14:06 willmcgugan

pyfilesystem2 pyfilesystem2 copied to clipboard

MountFS's listdir doesn't take into account existing mounts

pyfilesystem2
pyfilesystem2 copied to clipboard