pyfilesystem2
pyfilesystem2 copied to clipboard
MountFS's listdir doesn't take into account existing mounts
Hi, First of all thank you for this really good lib !
I'm hitting a small bug using MountFS: the listdir
method doesn't take into account mounts for dir listing.
here's a simple test case:
#!/usr/bin/env python3
from fs.osfs import OSFS
from fs.mountfs import MountFS
mounted = OSFS('dir_a')
original = OSFS('dir_b')
final = MountFS()
final.mount('the_mount', mounted)
final.mount('', original)
print(f'Final FS /: {final.listdir("/")}')
print(f'mounted FS /: {mounted.listdir("/")}')
print(f'Final FS /the_mount: {final.listdir("/the_mount")}')
My fs is:
$ tree dir_a dir_b
dir_a
└── file_a
dir_b
└── file_b
and thus, the resulting final
FS looks like so:
.
├── file_a
└── the_mount
└── file_b
However running the script results in:
Final FS /: ['file_b']
mounted FS /: ['file_a']
Final FS /the_mount: ['file_a']
As we can see, the mounted
FS is not taken into account by listdir
, however it exists in the MountFS
as show by the two last lines. The expected result should be:
Final FS /: ['file_b', 'the_mount']
I could try to hack something up but it would entail adding a final list to https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/mountfs.py#L159 , what do you think ?
Thanks and cheers ! Frank
From a quick look at the mountfs.py
file, it looks like things might be "going wrong" because of your usage of ''
as the mount-path in final.mount('', original)
.
However I dunno if the "bug" is that this behaviour should be allowed by mountfs, and mountfs is behaving incorrectly; or if mountfs should instead prevent empty strings being used as the mount-path? That's probably a decision for @willmcgugan to make...
Indeed, that's a really good point.
However, if we prevent empty strings, I see no other alternative of doing the thing above, thus we would be reducing the featureset of pyfilesystem, which would be a shame ? I'm happy to propose a MR next week if that's okay with you.
Cheers and have a good WK !
@willmcgugan While having another quick look at mountfs.py
, there appears to be an error in the docstring for _delegate
? It says it returns "(None, None)
if no filesystem is mounted on the given path
", but looking at the implementation it seems to return (self.default_fs, path)
if no filesystem is mounted on the given path
?
@Frankkkkk Note also that MountFS works slightly differently to the mount
command on Linux. On Linux you can have a mountpoint mounted at a directory on another mount, and the command will error if the directory doesn't exist. But looking at https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/mountfs.py#L102 it looks like it explicitly prevents overlapping mountpoints (which your empty mount-directory side-steps), and it automatically creates the 'virtual' directory that the mountpoint appears at.
P.S. @willmcgugan I wonder if a MountFS.unmount(path) method might be useful? :shrug:
@Frankkkkk I've been playing about with MountFS this afternoon, and I think the assumption that MountFS.mount
makes about the mount-paths being non-overlapping, means that allowing a mount-path of ''
is indeed the "incorrect" behaviour, which needs to be fixed. I've also found another "incorrect" behaviour in MountFS, which I'll describe below...
However, if we prevent empty strings, I see no other alternative of doing the thing above
I think you might be able to achieve that using a combination of MountFS and MultiFS ? e.g.
#!/usr/bin/env python3
from fs.osfs import OSFS
from fs.mountfs import MountFS
from fs.multifs import MultiFS
mounted = OSFS('dir_a')
original = OSFS('dir_b')
intermediate = MountFS()
intermediate.mount('the_mount', mounted)
final = MultiFS()
final.add_fs('original', original)
final.add_fs('mount_fs', intermediate)
print(f'Final FS /: {final.listdir("/")}')
print(f'mounted FS /: {mounted.listdir("/")}')
print(f'Final FS /the_mount: {final.listdir("/the_mount")}')
prints:
Final FS /: ['the_mount', 'file_b']
mounted FS /: ['file_a']
Final FS /the_mount: ['file_a']
:smiley:
@willmcgugan @althonos Following on from the example-usage at https://docs.pyfilesystem.org/en/latest/reference/mountfs.html , here's a test-case that demonstrates the other MountFS bug that I found:
from fs.mountfs import MountFS
combined_fs = MountFS()
combined_fs.mount('config', config_fs)
combined_fs.mount('resources', resources_fs)
combined_fs.makedir('config/dir1') # all good, this creates 'dir1' in config_fs
combined_fs.makedir('resources/dir2') # all good, this creates 'dir2' in resources_fs
combined_fs.makedir('dir3') # oops, this creates 'dir3' in the internal MemoryFS used by MultiFS, and will be lost when combined_fs is closed!
I've got a WIP patch which I'll be PRing soon.
Note also that MountFS works slightly differently to the mount command on Linux.
I thought about adding a NestedMountFS
that would work similarly to the Linux mount command, i.e. you could do something like:
from fs.memoryfs import MemoryFS
from fs.nestedmountfs import NestedMountFS
fs_1 = MemoryFS()
fs_1.makedir('dir1_a')
fs_1.makedir('dir1_b')
fs_2 = MemoryFS()
fs_2.makedir('dir2_a')
fs_2.makedir('dir2_b')
mountfs = NestedMountFS(rootfs=fs_1)
mountfs.mount('dir1_a', fs_2)
mountfs.tree()
# |-- dir1_a
# | |-- dir2_a
# | `-- dir2_b
# `-- dir1_b
mountfs.makedir('dir1_b/foo') # creates 'dir1_b/foo' in fs_1
mountfs.makedir('dir1_a/dir2_a/bar') # creates 'dir2_a/bar' in fs_2
and I think it should be fairly straightforward to implement (with most of the changes being in the mount
and _delegate
functions), but there's all kinds of awkward edge-conditions, like:
- what happens (to
mountfs
) if you dofs_1.removedir('dir1_a')
- what happens (to
mountfs
) if you dofs_1.makedir('dir1_a/dir2_b')
- etc. etc.
so it might be easiest to just forget the idea of a NestedMountFS
and instead rely on combining MountFS
and MultiFS
as in my earlier example, as they each have well-defined rules about how they behave with multiple filesystems.
Hi @lurch really nice job !
Indeed, the workaround you proposed is good. The only problem is that this works fine for reading operations, but not for writing ones.
For example, I can't make the the_mount
dir readonly as creating a file there creates it on the first fs, which is a shame.
You're right with a NestedMountFS
. It could be really useful for some operations. But in this case, wouldn't fixing MountFS
be better in the long term (instead of dealing with 2 quasi-identical fs) ?
Have a nice week !
For example, I can't make the the_mount dir readonly as creating a file there creates it on the first fs, which is a shame.
You can change which fs is writeable, and also change the priority of the FSes, see https://docs.pyfilesystem.org/en/latest/reference/multifs.html#fs.multifs.MultiFS.add_fs :slightly_smiling_face:
You're right with a NestedMountFS. It could be really useful for some operations. But in this case, wouldn't fixing MountFS be better in the long term (instead of dealing with 2 quasi-identical fs) ?
I did think about that, but unfortunately the semantics between the current MountFS and the proposed NestedMountFS are too different, so it wouldn't be possible to "fix" MountFS to handle both scenarios. For example, this is possible with MountFS:
mount_fs = MountFS()
fs1 = OSFS('path1')
fs2 = OSFS('path2')
mount_fs.mount("/subdir/m1", fs1)
mount_fs.mount("/subdir/m2", fs2)
# creating a file in `/subdir/m1` puts it in fs1, creating a file in `/subdir/m2` puts it in fs2, trying to create a file in `subdir` throws an error
but that wouldn't be possible for NestedMountFS, because NestedMountFS would require the mount-paths to exist before mounting the FSes to them. I.e. the NestedMountFS equivalent of the above would be something like:
basefs = OSFS('basepath')
basefs.makedir('subdir')
basefs.makedir('subdir/m2')
basefs.makedir('subdir/m1')
mount_fs = NestedMountFS(rootfs=basefs)
fs1 = OSFS('path1')
fs2 = OSFS('path2')
mount_fs.mount("/subdir/m1", fs1)
mount_fs.mount("/subdir/m2", fs2)
# creating a file in `/subdir/m1` puts it in fs1, creating a file in `/subdir/m2` puts it in fs2, creating a file in `/subdir` puts it in basefs
If you have any suggested solutions for the edge-conditions I described in my previous message, I'm all ears... :ear:
MountFS solves the separate-changes-in-different-FSes problem by deliberately making the FS root directories non-overlapping, so each FS remains writeable. MultiFS solves the separate-changes-in-different-FSes problem by having the FSes rooted at the same place, having a strict "priority ordering" that the FSes get looked at, and declaring that only one of the FSes is writeable. Hmmm, I wonder if a NestedMultiFS would be a more robust way to handle the edge-conditions than a NestedMountFS? (i.e. drop the requirement that the mount-path needs to exist first, but only allow a single FS to be writeable) :thinking:
multi_fs = NestedMultiFS()
fs1 = OSFS('path1')
fs2 = OSFS('path2')
multi_fs.mount("/subdir", fs1)
multi_fs.mount("/subdir/m2", fs2, write=True)
# creating a file in `/subdir/m2` puts it in fs2, trying to create a file in `subdir` throws an error
# Although what happens if we do
fs3 = OSFS('path3')
multi_fs.mount("/subdir/m2/m3", fs3)
# and then try to create a file in `/subdir/m2/m3` - should it create it in fs2, or should it throw an error because fs3 isn't writeable?
I'd need to spend some more time playing with the existing MultiFS to get a feel for how it works (especially with regards to deletions & renames).
Have a nice week !
You too!
@lurch Sorry for the late reply. TBH it's been such a long time since I implemented that, and I don't recall my reasoning there. There's quite possibly edge cases I didn't consider at the time. I'll trust your assessment!