Exclude non-`include`d dotfiles regardless of the used VCS

Open newpavlov opened this issue 1 month ago • 9 comments

Currently cargo package follows the following rules:

If include is not specified, then the following files will be excluded:

  • If the package is not in a git repository, all “hidden” files starting with a dot will be skipped.
  • If the package is in a git repository, any files that are ignored by the gitignore rules of the repository and global git configuration will be skipped.
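The two rules above can be modeled roughly as follows. This is only an illustrative sketch, not cargo's actual implementation: the function names `is_hidden` and `excluded_by_default` are hypothetical, the `git_ignored` flag stands in for a real gitignore lookup, and treating a path as hidden when any of its components starts with a dot is an assumption.

```rust
use std::path::{Component, Path};

/// Returns true if any normal component of `path` starts with a dot.
/// (Assumption: hidden directories hide everything beneath them.)
fn is_hidden(path: &Path) -> bool {
    path.components().any(|c| match c {
        Component::Normal(name) => name.to_string_lossy().starts_with('.'),
        _ => false,
    })
}

/// Hypothetical model of the default exclusion rule when `include` is not set.
/// `git_ignored` stands in for the result of a real gitignore lookup.
fn excluded_by_default(path: &Path, in_git_repo: bool, git_ignored: bool) -> bool {
    if in_git_repo {
        // In a git repository only the gitignore rules apply; dotfiles are kept.
        git_ignored
    } else {
        // Outside git, hidden files are skipped.
        is_hidden(path)
    }
}
```

Under this model, `.github/workflows/ci.yml` is packaged when the crate lives in a git repository and is not gitignored, but skipped when it does not, which is the inconsistency discussed here.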

This handling is arguably inconsistent and results in surprising behavior, such as inclusion of the .github/ folder for one-crate repositories (e.g. see here).

I suggest removing the git exception and always excluding dotfiles which are not listed in the include field.

Previous IRLO discussion: https://internals.rust-lang.org/t/23700

newpavlov avatar Nov 16 '25 22:11 newpavlov

If we're going to consider this, it would help to have a list of dotfiles in all currently published crates, sorted by frequency. That would help us gauge the impact, both positive and negative.

If we don't have clear evidence that it would almost always be helpful and never be harmful, then I would personally argue for the principle of least surprise by not excluding things.

joshtriplett avatar Nov 16 '25 23:11 joshtriplett

This would be a breaking change, and one that is hard to discover, even across an edition boundary, until the crate is actually published.

If we had a chance to change it, I would lean towards removing all heuristics for consistency

weihanglo avatar Nov 16 '25 23:11 weihanglo

I would personally argue for the principle of least surprise by not excluding things

IMO making an exception for git goes against this principle. Excluding files in .gitignore (or any other VCS-specific exclusion rules file) makes sense, but ignoring dotfiles only on non-git VCSes is certainly very surprising.

Additionally, as I wrote in the IRLO thread, I was surprised that published packages by default include .github/ and .gitignore. Granted, this surprise is more subjective, but it can still be viewed as not following the principle.

If we had a chance to change it, I would lean towards removing all heuristics for consistency

I think it's worth investigating how many crates rely on the inclusion of dotfiles to work properly and how many crates include dotfiles unnecessarily (including crates which manually exclude dotfiles in their Cargo.toml). Obviously, breaking crates by not including dotfiles has a bigger impact than ballooning package sizes, so the former should have a higher weight than the latter. But if we have 5-10 times or more crates in the latter category, I think the dotfiles exclusion heuristic should be considered a useful one.

newpavlov avatar Nov 17 '25 00:11 newpavlov

That would help us gauge the impact, both positive and negative.

Without looking at the .crate files, we do have one indicator for this: cargo vendor users.

When you run cargo package or cargo publish in a git repo, the .gitignore rules apply. When someone then ran cargo vendor on that package (until #15514), it was no longer in a git repo, so the dotfile exclusion rules applied instead and dotfiles were lost.

Issues reported on this:

  • #13662
  • #15080
  • #13691

Looking over those issues (when they were filed, the comments, emoji reactions, cross-links, etc.), it doesn't appear that this was a very big issue.

epage avatar Nov 17 '25 17:11 epage

If we had a chance to change it, I would lean towards removing all heuristics for consistency

imo the current heuristics are based on the user giving a pretty high quality signal of what is important.

epage avatar Nov 17 '25 17:11 epage

From the Internals thread: https://internals.rust-lang.org/t/exclude-github-and-gitignore-from-published-packages-by-default/23700/3?u=epage

Could the export-ignore property be used instead? I don't see why something should be in the .crate if git archive would ignore it too. I just don't want to see cargo grow a menagerie of default exclusions for every forge's hidden directory (sr.ht, Forgejo, GitLab) that changes based on the version in use.

Personally,

  • this feels too obscure to rely on as an existing signal
  • I'm concerned about what conventions are for source archives and if they are well aligned with .crate, both ways (e.g. we don't want people messing up their source archives for cargo's sake)
  • this is git specific and if we over rely on it, we hurt the experience for non-git users

epage avatar Nov 17 '25 17:11 epage

Looking over the internals thread, we discussed various dot files that may be a problem but only came up with:

  • .gitignore: we already did what it said
  • .cargo/config.toml: this is never read and its existence is a point of confusion

To me, the biggest risk is this would close the door on #14001 and we'd need to decide on that first (unless we made an exemption for .cargo).

epage avatar Nov 17 '25 17:11 epage

My proposal for this:

  • Add a new package.exclude-hidden = <bool>
  • On the next edition, change the default to package.exclude-hidden = true
    • cargo fix could add package.exclude-hidden = false to
      • all packages that do not set package.publish = false
      • all packages that do not set package.include
      • any package that reports hidden files from cargo package --list
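The edition-dependent default in this proposal could be resolved along these lines. This is a sketch under stated assumptions: `package.exclude-hidden` is only a proposed field, and the `Edition` enum and `effective_exclude_hidden` function are hypothetical names that do not exist in cargo.

```rust
/// Hypothetical stand-in for "the current editions" versus "the next edition".
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Edition {
    Current,
    Next,
}

/// Resolves the proposed `package.exclude-hidden` setting.
/// `explicit` is the value written in Cargo.toml, if any.
fn effective_exclude_hidden(explicit: Option<bool>, edition: Edition) -> bool {
    // An explicit value always wins; otherwise the default flips to `true`
    // starting with the next edition.
    explicit.unwrap_or(edition == Edition::Next)
}
```

With this shape, a `cargo fix` migration that writes `package.exclude-hidden = false` keeps the old behavior after the edition bump, because the explicit value takes precedence over the new default.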

Concerns

  • Impact on #11405
  • Ignoring of .cargo/config.toml (#11405)
  • Ignoring of .keep files

epage avatar Nov 17 '25 17:11 epage

unless we made an exemption for .cargo

Making an exception for .cargo sounds like a bad idea to me. For example, I plan to use .cargo/config.toml with resolver.incompatible-rust-versions = "allow" in my repositories and all config.toml files which I've seen in practice (e.g. for Web WASM testing) are not relevant for published packages.

The case of #14001 looks like a really bad hack which should be discouraged. Plus such projects could always manually include .cargo if truly necessary.

The exclude-hidden proposal sounds good to me.

newpavlov avatar Nov 17 '25 17:11 newpavlov