globwalk icon indicating copy to clipboard operation
globwalk copied to clipboard

skip walking irrelevant directories when matching globs

Open robinfriedli opened this issue 3 years ago • 3 comments

  • for example, when given the glob "src/resources/templates/*.html", it is unnecessary to walk through any directories in the working directory other than src/
    • to skip irrelevant directories the iterator now checks whether the match result is Match::None when matching a directory and decides to skip the directory if it doesn't match any parents of any globs
    • a second Override stores all globs for all parent paths for all globs provided by the user
    • refs issue #29

robinfriedli avatar May 12 '21 01:05 robinfriedli

Overall looks good. Note that thos does break patterns that have a slash inside an or-group:

a/{b/c,b/g}/*.gif

Not 100% sure what's the right solution, probably to normalize it into several paths (i.e. one path for each group member).

Not sure how multiple or nested groups act in ignore.

Another option is that if you detect a non trivial case, replace it with ** (it's fine since it's an optimization, we can also make it better later)

Gilnaa avatar May 13 '21 20:05 Gilnaa

FWIW that's the test case:

#[test]
    fn test_florp() {
        let dir = TempDir::new("globset_walkdir").expect("Failed to create temporary folder");
        let dir_path = dir.path();
        create_dir_all(dir_path.join("a/b/c")).expect("");
        create_dir_all(dir_path.join("a/b/g")).expect("");
        create_dir_all(dir_path.join("a/e/f")).expect("");

        touch(
            &dir,
            &[
                "a[/]b[/]c[/]lib.rs",
                "a[/]b[/]g[/]lib.rs",
                "a[/]e[/]f[/]lib.rs",
            ][..],
        );

        let expected: Vec<_> = [
            "a[/]b[/]c[/]lib.rs",
            "a[/]b[/]g[/]lib.rs",
            "a[/]e[/]f[/]lib.rs",
        ]
        .iter()
        .map(normalize_path_sep)
        .collect();

        let patterns = [
            "a/{b/c,b/g}/*rs",
            "a/e/**/*rs",
        ];
        let glob = GlobWalkerBuilder::from_patterns(dir_path, &patterns)
            .build()
            .unwrap();

        equate_to_expected(glob, expected, dir_path);
    }

Gilnaa avatar May 13 '21 20:05 Gilnaa

Yeah any iterator provided by std::fs is probably not gonna to cut it, at least not without some hacky business. We might want to parse the glob ourselves and ignore path separators that appear in any kind of glob expression ("[]" as well maybe though I'm not sure where path separators are even allowed to appear, I don't presume escaping path separators is valid for instance and slashes probably don't get matched in [] anyway). Might be sufficient to just leave components containing glob expressions as is and ignoring path separators until the expression ends. Windows file paths always give me a headache though.

robinfriedli avatar May 13 '21 22:05 robinfriedli