cargo icon indicating copy to clipboard operation
cargo copied to clipboard

fix(toml): Warn, rather than fail publish, if a target is excluded

Open epage opened this issue 3 months ago • 10 comments

What does this PR try to resolve?

We have a couple of problems with publishing

  • Inconsistent errors: if a target that package doesn't verify is missing path, it will error, while one with path won't, see #13456
  • Users may want to exclude targets and their choices are
    • Go ahead and include them. I originally excluded my examples before doc-scraping was a think. The problem was if I had to set required-features, I then could no longer exclude them
    • Muck with Cargo.toml during publish and pass --allow-dirty

This fixes both by auto-stripping targets on publish. We will warn the user that we did so.

This is a mostly-one-way door on behavior because we are turning an error case into a warning. For the most part, I think this is the right thing to do. My biggest regret is that the warning is only during package/publish as it will be too late to act on it and people who want to know will want to know when the problem is introduced. The error is also very late in the process but at least its before a non-reversible action has been taken. Dry-run and yank help.

Fixes #13456 Fixes #5806

How should we test and review this PR?

Tests are added in the first commit and you can then follow the commits to see how the test output evolved.

The biggest risk factors for this change are

  • If the target-stripping logic mis-identifies a path as excluded because of innocuous path differences (e.g. case)
  • Setting a minimum MSRV for published packages: auto* were added in 1.27 (#5335) but were insta-stable. autobins = false did nothing until 1.32 (#6329). I have not checked to see how this behaves pre-1.32 or pre-1.27. Since my memory of that error is vague, I believe it will either do a redundant discovery or it will implicitly skip discovery

Resolved risks

  • #13729 ensured our generated target paths don't have \ in them
  • #13729 ensures the paths are normalize so the list of packaged paths

For case-insensitive filesystems, I added tests to show the original behavior (works locally but will fail when depended on from a case-sensitive filesystem) and tracked how that changed with this PR (on publish warn that those targets are stripped). We could try to normalize the case but it will also follow symlinks and is likely indicative of larger casing problems that the user had. Weighing how broken things are now , it didn't seem changing behavior on this would be too big of a deal.

We should do a Call for Testing when this hits nightly to have people to cargo package and look for targets exclusion warnings that don't make sense.

Additional information

This builds on #13701 and the work before it.

By enumerating all targets in Cargo.toml, it makes it so rust-lang/crates.io#5882 and rust-lang/crates.io#814 can be implemented without any other filesystem interactions.

A follow up PR is need to make much of a difference in performance because we unconditionally walk the file system just in case autodiscover != Some(false) or a target is missing a path.

We cannot turn off auto-discovery of libs, so that will always be done for bin-only packages.

epage avatar Apr 05 '24 19:04 epage

r? @ehuss

rustbot has assigned @ehuss. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot avatar Apr 05 '24 19:04 rustbot

Blocked on #13701. Posting now to get Windows testing.

epage avatar Apr 05 '24 19:04 epage

@ChrisDenton I'd be curious if you have any thoughts on the case-preserving change in this PR.

Existing behavior (shown in the first commit): we can detect SrC/MaIn.rs as src/main.rs and publish it. Windows consumers will be able to use the package. Linux consumers of it won;'t.

With us tracking targets in the published package, src/main.rs (what cargo sees) and SrC/MaIn.rs (what the filesystem sees) wouldn't match and we'd strip the target from the published manifest, preventing anyone from using it. The publisher will get a warning.

In the last commit, I use canonicalize to try to read the case from the filesystem and preserve that through Cargo, allowing the targets to be included and Linux users can now consume them.

This is shown in the commit history of the PR with the changes to the package::normalize_case test.

The one case not handled is if the user sets path = "src/main.rs" but the file is SrC/MaIn.rs then we will still strip the target.

epage avatar Apr 10 '24 22:04 epage

Hmm... I would just quickly note the case sensitivity is not necessarily just a cross-platform concern. E.g. Windows can have case-sensitive directories. So anywhere there is a mismatch between expected and actual case means that it's non-portable even possibly within the same system. This does make erroring on detected case mismatches more appealing to me as I'd personally want to be made very aware of the situation before I publish, especially if Cargo is unable to handle all such cases automatically. However, I can also see how that might be frustrating for other users so I don't feel that strongly about it. "It just works" is always appealing.

On a technical level I would worry a bit about canonical paths not necessarily matching the packaged directory structure, due to symlinks or something. Admittedly I'm uncertain if this would be a problem in practice or not.

ChrisDenton avatar Apr 10 '24 23:04 ChrisDenton

@ChrisDenton thanks for the input! I've taken out that commit for now and instead going with the approach of it needing a strong justification to go in. I want to do a call-for-testing on this when it hits nightly and will particularly call out publishing from Windows.

epage avatar Apr 11 '24 01:04 epage

Starting the conversation on the behavior changes in parallel to the review of the implementation

This change snapshots implicit build targets into the published package, making them explicit.

Immediate benefits

  • Makes implicit targets, explicit-without-path targets, and explicit-with-path targets consistent for error reporting (#13456) and for intentionally excluding targets (not needing --allow-dirty (as discussed in #13695)
  • Disallows adding build.rs to vendored source (#5806)
  • As a crate maintainer, makes it easier to inspect what the published result is

This unblocks

  • Performance optimizations. This allows us to shift all of the burden onto TOML parsing which would likely be easier to optimize
  • crates.io specializing their rendering on bin vs lib (rust-lang/crates.io#5882, rust-lang/crates.io#814)

Behavior changes that I'm looking for consensus on:

  • Switching some publish errors to warnings
    • Excluded implicit target
      • Before: silently ignored
      • After: ignored with warning
    • Excluded explicit target without a path
      • Before: error since a path can't be discovered
      • After: ignored with warning
    • Excluded explicit target with a path
      • Before: error if its a bin/lib, silently ignored otherwise
      • After: ignored with warning
  • Case mismatches
    • Situations:
      • Implicit file target with alt case of (e.g. src/Main.rs)
      • Implicit target search with alt case of (e.g. Src/Bin/)
      • Explicit target without path whose name has different case than actual file
      • Explicit target whose path has different case than the actual file
    • Before:
      • Publish worked
      • Package would work on case-insensitive filesystems
      • Package would fail on case-sensitive filesystems
    • After:
      • Ignored with a warning during publish
    • Justification:
      • This is a portability hazard, even within the same machine
      • We've not seen people organically bring up the problem that their packages don't work on other platforms. The main organic case situation was cargo.toml and that mostly came about because windows users didn't think to add the required case. This is the opposite.
      • These case mismatches could be indicative of more fundamental case problems and this could have a long tail of holes to patch
    • Alternative: normalize the case
      • One option is to use canonicalize but that will also resolve symlinks
      • We could enumerate directories and find a best match. This avoids "how is case-insensitivity implemented" because the cases we care about are ASCII. However, we don't really know whether the filesystem is case-sensitive or not

See changes to tests/testsuite/package.rs, particularly normalize_case and discovery* tests

@rfcbot fcp merge

epage avatar Apr 11 '24 14:04 epage

Team member @epage has proposed to merge this. The next step is review by the rest of the tagged team members:

  • [x] @Eh2406
  • [x] @Muscraft
  • [x] @arlosi
  • [x] @ehuss
  • [x] @epage
  • [ ] @joshtriplett
  • [x] @weihanglo

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot avatar Apr 11 '24 14:04 rfcbot

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot avatar Apr 20 '24 04:04 rfcbot

Although this is not technically a breaking change, this is still a behavior change that we love to have a longer window for users to test. The next no tool breakage week is approaching and I will merge this after a new beta branch comes out.

weihanglo avatar Apr 24 '24 18:04 weihanglo

:umbrella: The latest upstream changes (presumably #13804) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Apr 26 '24 17:04 bors

@bors r+

Thanks!

weihanglo avatar Apr 29 '24 18:04 weihanglo

:pushpin: Commit 8cc2f391ba4c72cee9129342e47b1ecb7ba959d3 has been approved by weihanglo

It is now in the queue for this repository.

bors avatar Apr 29 '24 18:04 bors

:hourglass: Testing commit 8cc2f391ba4c72cee9129342e47b1ecb7ba959d3 with merge ba6fb405284df654362275405175f3fac2b79dbf...

bors avatar Apr 29 '24 18:04 bors

:sunny: Test successful - checks-actions Approved by: weihanglo Pushing ba6fb405284df654362275405175f3fac2b79dbf to master...

bors avatar Apr 29 '24 18:04 bors

How is this related to #5806?

glandium avatar Apr 29 '24 20:04 glandium

If I understand correctly cargo vendor will vendor "published" Cargo.toml files. With #5806, you could drop a build.rs and it would be automatically picked up. After this change, it will be ignored and the only way to add a build.rs is to hand-edit the Cargo.toml which will result in a checksum error.

epage avatar Apr 29 '24 20:04 epage

So, if I understand correctly, this only "fixes" #5806 for crates published to crates.io, but e.g. crates vendored from a git repo, will still have the problem.

glandium avatar Apr 29 '24 20:04 glandium

From my understanding cargo vendor effectively runs cargo package on every dependency, regardless of the source. This change affects cargo package.

I tried this out locally by running cargo vendor on https://github.com/crate-ci/cargo-release/ which has git dev-dependencies. The vendored versions of those git dev-dependencies had the standard "this is from cargo package" boilerplate at the top and they had all of the auto* fields set to false from this change.

So I believe this fixes it for git as well. Feel free to test it when this hits nightly and let us know if it works or not!

epage avatar Apr 30 '24 02:04 epage

I tested 1.80.0-nightly (c987ad527 2024-05-01), and vendoring indeed adds the auto* and build fields (and more).

glandium avatar May 02 '24 06:05 glandium