glob
'*' broken in Pattern::matches_path
Here's a recursive file list:
$ find src
src/
src/main.rs
src/a/
src/a/b.rs
Running this program
extern crate glob;

use std::path::Path;
use glob::{glob, Pattern};

fn main() {
    let pattern = "src/*.rs";

    // Match a nested path directly against the pattern.
    let matched = Pattern::new(pattern).unwrap().matches_path(Path::new("src/a/b.rs"));
    println!("matches src/a/b.rs: {}", matched);

    // Expand the same pattern against the file system.
    for entry in glob(pattern).unwrap() {
        println!("globbed: {}", entry.unwrap().display());
    }
}
outputs
matches src/a/b.rs: true
globbed: src/main.rs
Note that it DID NOT output "globbed: src/a/b.rs", which means glob and Pattern::matches_path are inconsistent here. I think glob is correct as '*' should not match across slashes.
> I think glob is correct as '*' should not match across slashes.
This is configurable. Have you seen the MatchOptions struct?
It does look like there's some interesting stuff going on:
- glob::glob calls glob::glob_with, whose documentation states: "The options given are passed through unchanged to Pattern::matches_with(..) with the exception that require_literal_separator is always set to true regardless of the value passed to this function."
- The default value of MatchOptions has require_literal_separator disabled, which is what's used when you call Pattern::matches_path.
I think the above two things at least explain the behavior you're seeing, and since it's documented, I guess it's intended behavior?
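To make the effect of that flag concrete, here is a minimal, standard-library-only sketch of wildcard matching with a switch that plays the role of require_literal_separator. It is a toy model written for illustration, not the glob crate's implementation: the names wildcard_match and matches_from are hypothetical, and only `*` and `?` are supported.

```rust
/// Hypothetical model of wildcard matching (not the glob crate's code).
/// Supports only `*` (any run of characters) and `?` (any single character).
/// When `require_literal_separator` is true, neither wildcard may match '/'.
fn wildcard_match(pattern: &str, text: &str, require_literal_separator: bool) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    matches_from(&p, &t, require_literal_separator)
}

/// Simple backtracking matcher; worst-case exponential, which is fine for a sketch.
fn matches_from(p: &[char], t: &[char], sep: bool) -> bool {
    match p.split_first() {
        // Pattern exhausted: succeed only if the text is exhausted too.
        None => t.is_empty(),
        Some((&'*', rest)) => {
            // Try every possible length for the run consumed by `*`.
            for i in 0..=t.len() {
                if matches_from(rest, &t[i..], sep) {
                    return true;
                }
                // The run cannot be extended over a separator that must be literal.
                if i < t.len() && sep && t[i] == '/' {
                    return false;
                }
            }
            false
        }
        Some((&'?', rest)) => match t.split_first() {
            Some((&c, t_rest)) => !(sep && c == '/') && matches_from(rest, t_rest, sep),
            None => false,
        },
        Some((&c, rest)) => match t.split_first() {
            Some((&d, t_rest)) => c == d && matches_from(rest, t_rest, sep),
            None => false,
        },
    }
}
```

With this model, wildcard_match("src/*.rs", "src/a/b.rs", false) returns true, while the same call with true as the last argument returns false. That mirrors the reported mismatch: matches_path behaves like the first call and glob like the second.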
One other interesting thing to note is that MatchOptions::new() and MatchOptions::default() seem to produce different values.
Either way, I don't think * is broken.
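One way new() and default() can end up disagreeing is when Default is derived (every bool becomes false) while new() sets a field to true by hand. The sketch below uses a hypothetical stand-in struct called Options. The field names follow the MatchOptions documentation, but the assumption that new() turns on case_sensitive while Default is derived should be checked against the crate version you are using.

```rust
/// Hypothetical stand-in for glob's MatchOptions (not the crate's definition).
/// Deriving Default sets every bool field to false.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct Options {
    case_sensitive: bool,
    require_literal_separator: bool,
    require_literal_leading_dot: bool,
}

impl Options {
    /// Assumed to mirror MatchOptions::new(): case-sensitive, other flags off.
    fn new() -> Self {
        Options {
            case_sensitive: true,
            require_literal_separator: false,
            require_literal_leading_dot: false,
        }
    }
}
```

Under that assumption, Options::new() and Options::default() differ only in case_sensitive, so the two constructors are not interchangeable even though neither requires a literal separator.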
I do see now that it's technically documented, but you have to read several different places to piece it together. Note that both glob and matches/matches_path just say they use MatchOptions::new(). Regardless of what the docs say, I think this is extremely confusing default behavior for a globbing library.
@alexcrichton did you intend cargo to use require_literal_separator: false for package.include/package.exclude filters?
> Regardless of what the docs say, I think this is extremely confusing default behavior for a globbing library.
I'm not so sure about that. For example, if I write *.rs, then I'd probably expect that to match src/foo/bar.rs.
(Interestingly, man gitignore specifies that * in *.rs will match a / but the * in src/*.rs won't. In other words, the behavior of * changes based on whether there's a literal path separator in the glob.)
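The slash-dependent behavior described above can be sketched roughly as follows. This is a simplified model, not Git's implementation: the function names are hypothetical, only `*` is supported, a slash-free pattern is checked against the final path component only, and `**`, negation, and trailing-slash rules are ignored.

```rust
/// Matches one path component against a pattern made of `*` and literal bytes.
fn component_match(pattern: &[u8], name: &[u8]) -> bool {
    match pattern.split_first() {
        None => name.is_empty(),
        // `*` may consume any prefix of the remaining name, including nothing.
        Some((&b'*', rest)) => (0..=name.len()).any(|i| component_match(rest, &name[i..])),
        Some((c, rest)) => match name.split_first() {
            Some((d, name_rest)) => c == d && component_match(rest, name_rest),
            None => false,
        },
    }
}

/// Simplified gitignore-style rule (assumption: no `**`, negation, or trailing slashes).
fn ignore_style_match(pattern: &str, path: &str) -> bool {
    if !pattern.contains('/') {
        // No separator in the pattern: test the file name, whatever its depth.
        let name = path.rsplit('/').next().unwrap_or(path);
        return component_match(pattern.as_bytes(), name.as_bytes());
    }
    // Separator present: compare component by component, so `*` stays within one.
    let pats: Vec<&str> = pattern.split('/').collect();
    let parts: Vec<&str> = path.split('/').collect();
    pats.len() == parts.len()
        && pats
            .iter()
            .zip(&parts)
            .all(|(p, s)| component_match(p.as_bytes(), s.as_bytes()))
}
```

In this model, ignore_style_match("*.rs", "src/foo/bar.rs") is true, while ignore_style_match("src/*.rs", "src/foo/bar.rs") is false and ignore_style_match("src/*.rs", "src/main.rs") is true.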
I do agree that the current docs are confusing, and especially that new() and default() returning different values is confusing.
Ran into this as well. From https://man7.org/linux/man-pages/man7/glob.7.html :
> Globbing is applied on each of the components of a pathname separately. A '/' in a pathname cannot be matched by a '?' or '*' wildcard, or by a range like "[.-0]".
Personally, I find this far more sensible. I think this should be made the default (although that would be a breaking change). Either way, the current behavior should be documented more clearly, as it deviates from what one might expect.