nushell icon indicating copy to clipboard operation
nushell copied to clipboard

Fix mv data loss when changing folder case (step 1)

Open rgwood opened this issue 2 years ago • 1 comments

Description

Related to #6583, which is a really ugly bug where mv can delete folders on Windows. The mv code works like this:

  1. Try to rename a directory with std::fs::rename()
  2. If that failed, fall back to fs_extra::dir::move_dir()

We had a bug that was causing directory renames to always fall back to fs_extra, which has a bug on case-insensitive filesystems. I've fixed our bug and have added tests to catch folder deletion when using mv.

The author of fs_extra is also working on a fix and I'll update our dependency when that's ready.

Tests

Make sure you've done the following:

  • [x] Add tests that cover your changes, either in the command examples, the crate/tests folder, or in the /tests folder.
  • [x] Try to think about corner cases and various ways how your changes could break. Cover them with tests.
  • [ ] If adding tests is not possible, please document in the PR body a minimal example with steps on how to reproduce so one can verify your change works.

Make sure you've run and fixed any issues with these commands:

  • [x] cargo fmt --all -- --check to check standard code formatting (cargo fmt --all applies these changes)
  • [x] cargo clippy --workspace --features=extra -- -D warnings -D clippy::unwrap_used -A clippy::needless_collect to check that you're using the standard code style
  • [x] cargo test --workspace --features=extra to check that all the tests pass

rgwood avatar Sep 22 '22 05:09 rgwood

Dang, looks like some legitimate test failures on Ubuntu. It's late here, will take a closer look another time.

rgwood avatar Sep 22 '22 05:09 rgwood

Looking at the failed does_not_error_when_some_file_is_moving_into_itself test, I guess 11 gets renamed to 12 instead of moved to 12/11? (probably also why the other test fails, target directory gets replaced instead of used as parent)

So the extra !from.is_dir() check is probably incorrect, though looking at the docs I don't see why std::fs::rename would normally fail, so I'm not sure what you meant with

We had a bug that was causing directory renames to always fall back to fs_extra

Zapeth avatar Sep 22 '22 14:09 Zapeth

@Zapeth Say the user runs mv bar Bar in C:\foo\. Then this block runs:

if to.is_dir() {
        let from_file_name = match from.file_name() {
            Some(name) => name,
            None => return Err(ShellError::DirectoryNotFound(to_span, None)),
        };

        to.push(from_file_name);
    }

to ends up as C:\foo\Bar\bar instead of C:\foo\Bar. This was causing std::fs::rename(&from, &to) to fail.

But yeah, I think my initial fix was wrong. Maybe a special case in move_file for when from and to only differ by case would be better.

rgwood avatar Sep 22 '22 15:09 rgwood

to ends up as C:\foo\Bar\bar instead of C:\foo\Bar. This was causing std::fs::rename(&from, &to) to fail.

True, but C:\foo\Bar\bar should still be the expected result if Bar already exists.

(I would be interested to write a PR for this myself but I'm currently short on time, maybe I can look into it on the weekend if this hasn't been resolved until then)

Zapeth avatar Sep 22 '22 17:09 Zapeth

True, but C:\foo\Bar\bar should still be the expected result if Bar already exists.

Case insensitive filesystems are the problem here. to.is_dir() returns true even if the the directory name is bar on disk and Bar in to. C:\foo\Bar\bar is not correct if the user is trying to rename C:\foo\bar to C:\foo\Bar.

(I would be interested to write a PR for this myself but I'm currently short on time, maybe I look into it on the weekend if this hasn't been resolved until then)

That would be great thanks; we can play it by ear, I'm a little busy today but might be able to fix this up tomorrow.

rgwood avatar Sep 22 '22 17:09 rgwood

I guess one way to fix this would be to make the to.is_dir() check case-sensitive on all OSs, including Windows.

I'm not sure if there's a good cross-platform way to do that; we could check to against all file names in the directory (using read_dir()) but that's a little inefficient. If we're willing to do a Windows-specific fix, GetFileInformationByHandleEx could be used to get the actual filename as it exists on disk.

rgwood avatar Sep 23 '22 00:09 rgwood

This crate might help: https://docs.rs/same-file/latest/same_file/

rgwood avatar Sep 23 '22 00:09 rgwood

All righty, ended up using same_file and now the tests are passing. I'll merge this tomorrow if there are no objections.

Conveniently, we were already building same_file as a transitive dependency before this PR.

rgwood avatar Sep 23 '22 04:09 rgwood

Great, though I would have preferred not to rely on a new dependency for this, but I guess its fine if its a transitive dependency already.

Thanks for the quick fix :+1:

Zapeth avatar Sep 24 '22 07:09 Zapeth