cargo
cargo copied to clipboard
Cargo should warn when "include" doesn't match any files, or filters files from parent directories
Problem
It appears that cargo package with the include directive does not warn when it does not match a file, or when a file is from a parent directory. While I think the latter case is justifiable, I think it'd make sense to warn when these cases occur. This is similar to #7830 for license-file and #5911 for the readme directives.
Steps
Say we create a crate with the following steps:
% mkdir foo
% cd foo
% echo "hello world" > hello
% cargo new mycrate
% cd mycrate
% cat > Cargo.toml
cat > Cargo.toml
[package]
name = "foo"
version = "0.1.0"
edition = "2018"
include = ["../hello", "Cargo.toml", "src/*.rs"]
% cargo package --allow-dirty --list
If we then create a package with cargo package, cargo will silently ignore the included file from the parent directory:
% cargo package --list --allow-dirty
warning: manifest has no description, license, license-file, documentation, homepage or repository.
See https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata for more info.
Cargo.lock
Cargo.toml
Cargo.toml.orig
src/main.rs
Possible Solution(s)
I think we should match the same "emit a warning" behavior that was implemented for #7830.
Notes
Output of cargo version: cargo 1.53.0 (4369396ce 2021-04-27)
Is this a good "first issue"? If so I can try to take a crack at it this weekend. I assume changes will be similar to the license file PR linked above. I can see that it looks like this will be mostly an addition to the build_ar_list(...) method in cargo_package.rs. If I'm going about this totally the wrong way, let me know.
I've looked at this a bit more and I think it's unclear exactly what warning behavior we want with glob includes. Is the intent to match against the rule and give a warning if there are no matches? or should we check that the directory up until the wildcard exists and warn only if the folder that the rule identifies is missing?
I looked at how these globs are included in path.rs, and it's unclear to me how I might implement a "warn on glob not matching any files" behavior, unless we want to open up the private functions in path.rs or duplicate that functionality.
I'm unsure how difficult this will be. Essentially I think it should give a warning if any particular include rule does not match any files. Since these are gitignore rules, that might be a little tricky. The way I would start is to have a separate code path somewhere around ops::package_one that will iterate over the rules and check if any of them don't match any files. I think it would just need to create a GitignoreBuilder for each rule, iterate over all files from the package root, and check that each rule matches at least one file.
That's at least how I would start and see how it goes. Off the top of my head, I can think of a few tricky parts:
!rules may need special handling. I'm not sure how to handle those (maybe ignore them?).- We don't want this to be too slow, and there is risk walking the entire filesystem in a large workspace could be a lot of files. Care should be taken to avoid adding something that would be really slow.
- Can this use a simple
WalkDirto blindly walk the files from the package root, or should it be guided by git likePathSourcedoes? Using git is quite a bit more complicated, and I'm uncertain if it is feasible to share any code withPathSource. - Should
excluderules be ignored? I think it is maybe too risky to validate them as they could have a high chance of giving a false warning. Examples:- Rules like
exclude=["*.bak", "build"]to avoid various temp or build files. - Someone has an
excluderule for a git submodule, and they publish without initializing the submodule, there won't be any files to discover.
- Rules like
I was thinking something similar, but was unsure if we wanted to duplicate out more path walking functionality outside of path.rs since the behavior is similar. I can absolutely put together a first draft of appropriate behavior and we can iterate on that. Regarding specific points:
!rules could be checked if they change the output by doing a "normal" check without the!prefix, then adjust the error message to something along the lines of "includ pattern {} is excluding file {} but no file was found". We could do an additional check that the ignore pattern covers a more specific section than a previous include, and give a warning like "include pattern {} is attempting to exclude path {} as an override, but no prior rule for a parent of {} exists. Consider changing the include pattern or usingexclude"- I had similar concerns regarding performance as well, I imagine the performance of walking the file tree is going to be less significant, since in the case where we find any file, we can cease walking and not emit the error, and in the case where there are no files, ideally a corrective action is taken for successive runs of
cargo. Exactly how impactful these file walks will be on performance, I'm not sure. - My first thought is to match the walking behavior of the actual include logic as closely as possible. It would be unhelpful to emit warnings that don't line up with what we actually include later. I agree logic sharing with
PathSourcemay be tricky. I'm going to attempt to get things working, and I can refactor after a basic implementation if there is any code to re-use. - I agree that checking
excluderules is unlikely to be helpful. They often include defensive patterns and such warnings would likely be more trouble than they are worth.