pyfilesystem2 icon indicating copy to clipboard operation
pyfilesystem2 copied to clipboard

File is deleted / empty when moved / copied onto itself

Open tfeldmann opened this issue 3 years ago • 10 comments

As discussed in #542 a file is deleted in some cases when moved onto itself. Working example:

from fs import open_fs

with open_fs("mem://") as currdir:
    currdir.writetext("test.txt", "Content")
    currdir.move("test.txt", "test.txt", overwrite=True)
    assert currdir.exists("test.txt")
    # -> raises AssertionError

This happens in both fs.base.FS.move and fs.move.move_file. Due to the optimization in #523 this does not happen on OSFS, but applies to other filesystems with shared connections. For example when moving a file from one connection to a ftp server onto itself in another connection to the same ftp server.

For a solution a function is needed to check whether two different (fs, path) tuples point to the same resource.

tfeldmann avatar Jul 21 '22 12:07 tfeldmann

check whether two different (fs, path) tuples point to the same resource

I'm not sure there's any reasonable way that that can be done consistently for different fs objects? :thinking: So it should be possible to fix it for fs.base.FS.move, but I dunno it it'll be possible to fix it for fs.move.move_file ?

I guess something like

from fs import open_fs
from fs.move import move_file

with open_fs("mem://") as currdir:
    currdir.makedir("subdir")
    currdir.writetext("subdir/test.txt", "Content")
    with currdir.opendir("subdir") as subdir:
        move_file(currdir, "subdir/test.txt", subdir, "test.txt")
        assert subdir.exists("test.txt")

might also trigger this "erroneous behaviour"? As far as the move_file function is concerned, I guess it can't differentiate the above from

from fs import open_fs
from fs.move import move_file

with open_fs("mem://") as currdir:
    currdir.makedir("subdir")
    currdir.writetext("subdir/test.txt", "Content")
    with open_fs("mem://") as subdir:
        subdir.writetext("test.txt", "Content")
        move_file(currdir, "subdir/test.txt", subdir, "test.txt")
        assert subdir.exists("test.txt")

:thinking: :man_shrugging:

Although I guess at least for the FTP case (i.e. where we can't get a syspath), you could try getting the download URL https://docs.pyfilesystem.org/en/latest/reference/base.html#fs.base.FS.geturl and use that as an attempt to see if two different (fs, path) tuples match to the same resource?

lurch avatar Jul 21 '22 13:07 lurch

I'm not sure there's any reasonable way that that can be done consistently for different fs objects? 🤔 So it should be possible to fix it for fs.base.FS.move, but I dunno it it'll be possible to fix it for fs.move.move_file ?

The move method needs to be patched in multiple places. I'll create a draft PR for this.

For move_file I'd try to compare syspaths first, urls second and then try to find out whether we are on the same filesystem and compare normalized abspaths (paying attention to fs case-sensitivity). Please have a look at the changes I made in #519, these would be really helpful (although using __eq__ might not be the way to go).

Your second example works fine by the way, because you have two separate filesystems.

tfeldmann avatar Jul 22 '22 09:07 tfeldmann

Your second example works fine by the way, because you have two separate filesystems.

Yes, that's what I was illustrating - the second example "works fine" and the first example "doesn't work"; but as far as the move_file function is concerned, the two function-calls "look" identical:

move_file(currdir, "subdir/test.txt", subdir, "test.txt")

lurch avatar Jul 22 '22 10:07 lurch

Ah I see, didn't think about that!

So some additional checks are needed when encountering a SubFS? (compare to subdir.delegate_fs() and subdir.delegate_path(path))

tfeldmann avatar Jul 22 '22 10:07 tfeldmann

copy has kind of the same problem, but leaves an empty file:

from fs import open_fs

with open_fs("mem://") as currdir:
    currdir.writetext("test.txt", "Content")
    currdir.copy("test.txt", "test.txt", overwrite=True)
    assert currdir.readtext("test.txt") == "Content"

tfeldmann avatar Jul 22 '22 10:07 tfeldmann

Do similar problems also occur when trying to move or copy directories onto themselves?

lurch avatar Jul 22 '22 12:07 lurch

Yep. move_dir deletes the folder.

from fs import open_fs

with open_fs("mem://") as mem:
    folder = mem.makedir("folder")
    folder.writetext("file1.txt", "Hello1")
    sub = folder.makedir("sub")
    sub.writetext("file2.txt", "Hello2")

    mem.movedir("folder", "folder")
    mem.tree()

    assert mem.readtext("folder/file1.txt") == "Hello1"
    assert mem.readtext("folder/sub/file2.txt") == "Hello2"

copydir hangs forever:

from fs import open_fs

with open_fs("mem://") as mem:
    folder = mem.makedir("folder")
    folder.writetext("file1.txt", "Hello1")
    sub = folder.makedir("sub")
    sub.writetext("file2.txt", "Hello2")

    mem.copydir("folder", "folder/sub/")

tfeldmann avatar Jul 22 '22 14:07 tfeldmann

Moving a folder into a subfolder of itself is also not a good idea at the moment. This should raise an exception.

tfeldmann avatar Jul 22 '22 15:07 tfeldmann

copydir hangs forever mem.copydir("folder", "folder/sub/")

It hangs forever (gets stuck in an infinite loop) if you try copying a folder into a subfolder of itself, but if you do:

from fs import open_fs

with open_fs("mem://") as mem:
    folder = mem.makedir("folder")
    folder.writetext("file1.txt", "Hello1")
    sub = folder.makedir("sub")
    sub.writetext("file2.txt", "Hello2")

    mem.copydir("folder", "folder")
    mem.tree()

    assert mem.readtext("folder/file1.txt") == "Hello1"
    assert mem.readtext("folder/sub/file2.txt") == "Hello2"

then the files get truncated to being empty again.

lurch avatar Jul 22 '22 15:07 lurch

Seems like this mem.copydir("folder", "folder") is accidentally already repaired in my PR #547

tfeldmann avatar Jul 22 '22 15:07 tfeldmann