
Bug: Not Ignoring Subdirectories in directories

Open AliSananS opened this issue 11 months ago • 4 comments

Hello, everyone!

I'm loving this tool so much; I use the tree flag to show current directories & subdirectories. However, one thing I don't want it to show is what's inside the node_modules directory.

This is the command that I'm using for my ls alias:

eza -T -L 2 --ignore-glob "node_modules/*"

If I remember correctly, this command used to work and would behave as expected (i.e., it should show node_modules's existence but not its subdirectories), but now it doesn't after I updated my system.

Operating system: ArchLinux aarch64
eza: v0.18.4 [+git]

AliSananS avatar Feb 29 '24 14:02 AliSananS

Can you try out older versions of eza and see if you can find out where it broke, or e.g. do a git bisect to find what commit broke it?

cafkafk avatar Mar 01 '24 08:03 cafkafk

Can you try out older versions of eza

I have tried various versions, even from 2021, and nothing seems to work. (Not eza, of course, but exa, which is what I was using earlier.)

What I observed is that it takes the value of the --ignore-glob flag quite literally. If I do:

eza -T -L 2 --ignore-glob "node_modules/"

It won't work, as opposed to:

eza -T -L 2 --ignore-glob "node_modules"

without the forward slash.

I was using exa before.

Trying older versions is making me question whether it ever really worked that way. 😂

AliSananS avatar Mar 01 '24 11:03 AliSananS

I was looking into this as it seemed not so complicated to me, and I found out that currently, patterns for IgnorePatterns are built with glob::Pattern::new(), see here. This generates the following pattern:

Pattern { original: "node_modules/*", tokens: [Char('n'), Char('o'), Char('d'), Char('e'), Char('_'), Char('m'), Char('o'), Char('d'), Char('u'), Char('l'), Char('e'), Char('s'), Char('/'), AnySequence], is_recursive: false }

Notice that it keeps the trailing forward slash. This is true for node_modules/*, node_modules/ and node_modules/**: all of them keep Char('/') in the tokens vector, and original is exactly what you passed in, as usual.

The problem is that when we call .matches() on this pattern against the name of the directory that we pass to IgnorePatterns::is_ignored() here, it always returns false, because we seem to pass the directory name without any slashes. So node_modules is passed as-is, and the pattern doesn't match.

Here is a code snippet that I used to test this:

fn main() {
    let pattern = glob::Pattern::new("node_modules/*").unwrap();
    let is_match = pattern.matches("node_modules");
    println!("{pattern:?}");
    println!("{is_match:?}");
}
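
To see why the trailing slash matters, here is a minimal sketch that does not use the glob crate at all. `wildcard_match` is a hypothetical, simplified stand-in (only `*` and literal characters are supported), so it only approximates what `glob::Pattern::matches()` does. It compares each pattern against the bare name and against the same name with a `/` appended.

```rust
/// Hypothetical, simplified stand-in for a glob matcher (NOT the glob crate):
/// `*` matches any run of characters (possibly empty), every other
/// character must match itself exactly.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let t: Vec<char> = text.chars().collect();
    // dp[j] is true when the pattern consumed so far matches t[..j].
    let mut dp = vec![false; t.len() + 1];
    dp[0] = true;
    for pc in pattern.chars() {
        let mut next = vec![false; t.len() + 1];
        if pc == '*' {
            // `*` may consume zero or more characters.
            let mut seen = false;
            for j in 0..=t.len() {
                seen = seen || dp[j];
                next[j] = seen;
            }
        } else {
            // A literal character must consume exactly one equal character.
            for j in 1..=t.len() {
                next[j] = dp[j - 1] && t[j - 1] == pc;
            }
        }
        dp = next;
    }
    dp[t.len()]
}

fn main() {
    for pattern in ["node_modules", "node_modules/", "node_modules/*"] {
        // The bare entry name, as described in the comment above.
        let bare = wildcard_match(pattern, "node_modules");
        // The same name with a slash appended, for comparison.
        let slashed = wildcard_match(pattern, "node_modules/");
        println!("{pattern}: bare={bare} slashed={slashed}");
    }
}
```

Under these assumptions, only the pattern without a slash matches the bare name, while the two patterns that end in `/` or `/*` match only the slashed form.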

I'm not sure how to fix this issue, but I figured it would be good to share what I found here. I'm totally willing to fix this if we find a way to do it.

wllfaria avatar Mar 01 '24 18:03 wllfaria

Ok, I went in the wrong direction with the first fix, and spent the last 2 days exploring how to solve this in a good way.

As mentioned above, when a glob ends with / or /*, the pattern keeps that last slash as part of it. And as of right now, at least from what I've seen in the code (please correct me if I'm wrong, it would greatly help me make this work), we retain only the directories that don't match the ignore globs. This means that any directory or file that matches gets excluded altogether.

Inside add_files_to_table we expand each directory that wasn't filtered out; this is where the entire contents of the children get expanded recursively. I made some naive tests by calling is_ignored on the name of each table entry with a '/' appended, which skips the expansion when the ignore-glob ends with a '/'. This did work, but it also felt too hacky.

This is how I tested it; this is the file in question:

// [SNIP]
fn add_files_to_table<'dir>(
// ...
) {
    if let Some(r) = self.recurse {
        if file.is_directory()
            && r.tree
            && !r.is_too_deep(depth.0)
            && !self.filter.should_skip_expansion(&file.name)
        {
            // ...
        }
    }
}

/// Checks if a dirname, when appended with a `/`, matches any
/// of the ignore patterns provided as argument.
///
/// This only exists because, when creating the patterns, any glob
/// that ends with `/` or `/*` keeps the `/` in the pattern,
/// and when listing directories, we display them without any `/`.
pub fn should_skip_expansion(&self, dirname: &str) -> bool {
    let dirname_with_slash = format!("{}/", dirname);
    self.ignore_patterns.is_ignored(&dirname_with_slash)
}

I wanted to discuss how we could possibly make this change in a way that adheres to your vision of how eza should be, and if this is even a thing you want to support.

Personally, I don't think appending a '/' to check whether the name matches a glob that ends with '/' or '/*' is a good approach. But by only removing the slash from the glob, as I did before, we can only ignore the matched directory altogether, so this check was the only straightforward way I found. What do you think?
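
For discussion, the two checks can be sketched as a self-contained program. Everything here is an assumption for illustration: `wildcard_match` is a simplified stand-in for the glob crate, this `IgnorePatterns` is a hypothetical struct rather than eza's real type, and `Action` and `classify` do not exist in eza. The sketch only shows how "hide the entry" and "list it but skip its contents" end up as two separate decisions.

```rust
/// Hypothetical, simplified matcher (NOT the glob crate): `*` matches any
/// run of characters, every other character must match itself.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let t: Vec<char> = text.chars().collect();
    // dp[j] is true when the pattern consumed so far matches t[..j].
    let mut dp = vec![false; t.len() + 1];
    dp[0] = true;
    for pc in pattern.chars() {
        let mut next = vec![false; t.len() + 1];
        let mut seen = false;
        for j in 0..=t.len() {
            if pc == '*' {
                seen = seen || dp[j];
                next[j] = seen;
            } else if j > 0 {
                next[j] = dp[j - 1] && t[j - 1] == pc;
            }
        }
        dp = next;
    }
    dp[t.len()]
}

/// Hypothetical stand-in for the ignore-pattern list (not eza's real type).
struct IgnorePatterns {
    patterns: Vec<String>,
}

impl IgnorePatterns {
    /// True when the name matches a pattern as-is.
    fn is_ignored(&self, name: &str) -> bool {
        self.patterns.iter().any(|p| wildcard_match(p, name))
    }

    /// Same idea as the check above: append `/` and match again.
    fn should_skip_expansion(&self, dirname: &str) -> bool {
        self.is_ignored(&format!("{dirname}/"))
    }
}

/// Hypothetical outcome for one directory in the tree view.
#[derive(Debug, PartialEq)]
enum Action {
    Hide,     // entry is filtered out entirely
    ListOnly, // entry is shown, its children are not expanded
    Expand,   // entry is shown and expanded
}

fn classify(filter: &IgnorePatterns, dirname: &str) -> Action {
    if filter.is_ignored(dirname) {
        Action::Hide
    } else if filter.should_skip_expansion(dirname) {
        Action::ListOnly
    } else {
        Action::Expand
    }
}

fn main() {
    let filter = IgnorePatterns {
        patterns: vec!["node_modules/*".to_string(), "target".to_string()],
    };
    for dir in ["node_modules", "target", "src"] {
        println!("{dir}: {:?}", classify(&filter, dir));
    }
}
```

With these sample patterns, node_modules would be listed without its contents, target would be hidden, and src would be expanded as usual.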

wllfaria avatar Mar 11 '24 15:03 wllfaria