cargo icon indicating copy to clipboard operation
cargo copied to clipboard

cargo-package: list of dirty files contains ignored files

Open dotdash opened this issue 5 years ago • 13 comments

Problem When running "cargo package" it complains about some files containing uncommited change, even though they're ignored by my .gitignore rules.

This seems to only happen when the file is in a directory that contains a mix of ignored and non-ignored files, but no tracked files at all.

$ git status
On branch master
nothing to commit, working tree clean

$ mkdir new_dir
$ touch new_dir/file
$ touch new_dir/.file.swp

$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

        new_dir/

nothing added to commit but untracked files present (use "git add" to track)

$ git status new_dir/
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

        new_dir/file

nothing added to commit but untracked files present (use "git add" to track)

$ cargo package
error: 2 files in the working directory contain changes that were not yet committed into git:

new_dir/.file.swp
new_dir/file

I expect to only see the entry for new_dir/file there, not new_dir/.file.swp.

Notes

Output of cargo version:

$ cargo --version
cargo 1.44.1 (88ba85757 2020-06-11)

dotdash avatar Jun 25 '20 11:06 dotdash

What are your gitignore rules?

ehuss avatar Jun 25 '20 14:06 ehuss

Nothing special, just:

/target
*.rs.bk
Cargo.lock
.*.swp

I originally had .*.swp in my .config/git/ignore but added it to the local .gitignore as well, just to make sure that that is not causing it. Other files like foo.rs.bk also show up, so it's not the leading dot either.

dotdash avatar Jun 25 '20 15:06 dotdash

Ah, I see what is happening. Git reports the entire directory as untracked if there is at least one non-ignored file. Cargo walks the entire directory without further consulting git. Git has an option to recurse into untracked dirs, but then Cargo would need to apply additional filtering, like skipping the target directory. That isn't particularly easy since git is file-based, so it is tricky to detect if it has recursed into the target directory.

ehuss avatar Jun 25 '20 17:06 ehuss

As far as I'm concerned, I'd actually be fine with cargo just reporting the directory instead of the individual files. I do wonder whether you would really need additional filtering. I understand it that the option causes git to recurse into the directory and report the individual untracked files, but ignored files are still excluded, and there's no recursion into directories that are completely ignored anyway.

Without really knowing if that's the right place, I made the following change, and it seems to do the right thing:

diff --git src/cargo/sources/path.rs src/cargo/sources/path.rs
index cf406e8dd..eb5e26512 100644
--- src/cargo/sources/path.rs
+++ src/cargo/sources/path.rs
@@ -254,6 +254,7 @@ impl<'cfg> PathSource<'cfg> {
         });
         let mut opts = git2::StatusOptions::new();
         opts.include_untracked(true);
+        opts.recurse_untracked_dirs(true);
         if let Ok(suffix) = pkg_path.strip_prefix(root) {
             opts.pathspec(suffix);
         }

dotdash avatar Jun 26 '20 11:06 dotdash

To clarify: It does the right thing because the target directory is ignored by the default rule target/. If that rule is missing, you get a listing including all the untracked files in that directory.

I suppose there's already a special case that handles not recursing into "target", which is actually not needed when that directory is already ignored via .gitignore. So this would be a breaking change for users that don't ignore the target directory in their .gitignore, which to me seems rather unlikely, but that's just from my POV.

dotdash avatar Jun 26 '20 11:06 dotdash

Heh, so I found the special case that ignores the target directory here and it's broken in that it doesn't just ignore the target directory, but anything called target anywhere in the source tree.

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   data/target

no changes added to commit (use "git add" and/or "git commit -a")

$ cargo package
warning: manifest has no description, license, license-file, documentation, homepage or repository.
See https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata for more info.
   Packaging bla v0.1.0 (/home/bs/src/bla)
   Verifying bla v0.1.0 (/home/bs/src/bla)
   Compiling bla v0.1.0 (/home/bs/src/bla/target/package/bla-0.1.0)
    Finished dev [unoptimized + debuginfo] target(s) in 0.10s

So from my POV I think a viable approach would be to perform the change from above and remove the special case for target. Alternatively, keep the special case, but fix it to only apply to directories called target. What do you think?

dotdash avatar Jun 26 '20 12:06 dotdash

I think it would probably be fine to make the above addition of recurse_untracked_dirs, and fix the target check so that it only matches a directory named target in the root of the package.

ehuss avatar Jul 02 '20 20:07 ehuss

I am experiencing a possible variant of this, but even if it's not it's probably worth piggybacking on this issue to keep it simple.

First of all, I believe this will do my best to assure this gets fixed in the gitoxide implementation.

Now for the problem I am seeing:

# trying to publish `gix-index`
error: 2 files in the working directory contain changes that were not yet committed into git:

src/.DS_Store
src/tbd/.DS_Store

src/tbd/ is untracked and won't show up in git status as its content, .DS_Store is ignored. src/.DS_Store has more files but all .DS_Store files are ignored.

We can validate this with gix:

gix exclude query gix-index/src/.DS_Store
/Users/byron/.gitignore:4:.DS_Store     gix-index/src/.DS_Store

Even though it didn't reproduce this time, I also ran into problems with files that couldn't be packed into the crate due to (intentionally broken) symlinks, even though these are in folders that git-ignored as well.

CC @weihanglo

Byron avatar Apr 26 '23 19:04 Byron

Similar case here, I believe. On macOS, packaging fails for one of my packages because of verification

error: failed to verify package tarball

Caused by:
  Source directory was modified by build.rs during cargo publish. Build scripts should not modify anything outside of OUT_DIR.
  Added: /Users/jost/Desktop/holochain/holochain-client-rust/target/package/holochain_client-0.4.5-rc.0/.DS_Store

  To proceed despite this, pass the `--no-verify` flag.

I have another package on my system which I could package and publish without such issues. Both contain a .DS_Store file in the root dir. None has it in .gitignore.

As an attempt I added .DS_Store to .gitignore in the problematic package but it makes no difference.

This reproduces the issue on my system:

  • git clone [email protected]:holochain/holochain-client-rust.git cargo-test
  • cd cargo-test
  • cargo package

jost-s avatar Oct 11 '23 16:10 jost-s

I also ran into this issue on mcaOS as soon as I converted one of my packages to a workspace:

git clone https://github.com/cubing/cubing.rs && cd cubing.rs
git checkout 139fe2cca6985f1f6d980f46999df26fbfef87dc
cargo package --package cubing_core

The output ends with:

    Finished dev [unoptimized + debuginfo] target(s) in 45.45s
error: failed to verify package tarball

Caused by:
  Source directory was modified by build.rs during cargo publish. Build scripts should not modify anything outside of OUT_DIR.
  Added: /Users/lgarron/Code/git/github.com/cubing/cubing.rs/target/package/cubing_core-0.12.0/.DS_Store

  To proceed despite this, pass the `--no-verify` flag.

I'm using the --no-verify flag to get around this, but that makes me very uncomfortable.

lgarron avatar Dec 01 '23 10:12 lgarron

As an attempt I added .DS_Store to .gitignore in the problematic package but it makes no difference.

I was about to say that maybe it's related to the way the .gitignore rule is specified, for instance by (more) global git configuration)…

[core]
        excludesFile = ~/.gitignore

…but re-reading this means that it can't be the case.

However, I believe that once the next stage of #11813 (gitoxide support) is rolled out, the issue should disappear and I will be sure to try to verify it truly is fixed then. The reproducer mentioned in the comment right above does not reproduce for me by the way.

Byron avatar Dec 01 '23 10:12 Byron

For what it's worth, my issue was with target/package/cubing_core-0.12.0/.DS_Store but the gtop-level .gitignore includes the entire /target folder.

So my issues seems to be more a race condition with macOS creating .DS_Store files, which is something that is not really possible to control on the main disk.

lgarron avatar Dec 03 '23 01:12 lgarron

It's interesting as the list of source files (for packaging) is curated using the ignore crate, and the list of dirty files is ultimately produced by git2, and the intersection of both are rightfully rejected.

This means both algorithms, run at different times, would have picked up that single .DS_Store file located in an ignored folder, which doesn't seem very racy to me. Unfortunately I also don't have a better explanation just yet unless both, somehow, managed to interpret .gitignore rules incorrectly which seems unlikely if target/ is excluded in the repository itself.

From my experience, these errors also persist, so repeated cargo package invocations turn up the same dirty files.

Byron avatar Dec 03 '23 07:12 Byron

So my issues seems to be more a race condition with macOS creating .DS_Store files, which is something that is not really possible to control on the main disk.

I thought this issue might be gone, but alas. This is still causing me failures while publishing a workspace, at which point I have to dig in and:

  • Figure out/remember that this issue exists.
  • Manually identify the failing cargo publish command.
  • Pass --no-verify and hope that nothing goes wrong.
  • Remember that I actually need to run additional cargo publish commands were never invoked due to initial failures.

This is awfully error-prone for out-of-the-box behaviour when publishing from a workspace on macOS. 😭

lgarron avatar Jul 18 '24 10:07 lgarron

I am really sorry to hear that and per my previous comment still puzzled how that happens. My hope would be that the ignore crate can be replaced with gitoxide which I'd expect fixes any issues regarding .gitignore rules, if there are any in the first place.

Byron avatar Jul 18 '24 12:07 Byron