Make `scan --ignore FILENAME` apply to blobs in Git repositories

Open bradlarsen opened this issue 2 years ago • 3 comments

The `scan` command currently has a `--ignore FILENAME` option, which lets one specify a gitignore-style rules file for paths to ignore when scanning. Those ignore rules are only applied to plain files that are scanned, not to blobs found within Git repositories. Those rules should also apply to Git blobs.

This is probably dependent on #16 being completed first.

bradlarsen • Dec 16 '22 05:12

This feature could be useful when dealing with scanning monorepos on a per-project basis: https://github.com/praetorian-inc/noseyparker/discussions/119

bradlarsen • Jan 18 '24 16:01

To implement this today, the most expedient approach:

Some complications:

  • It seems like the `GitIgnore` struct would have to be duplicated between the filesystem enumerator and the Git enumerator, since the `ignore` crate doesn't expose the one that it uses.
  • There are some corner cases in the semantics. If a path cannot be determined for a blob for whatever reason, should there be a warning?
  • The best that Nosey Parker could do is filter against the pathname a blob had in the commit where it was first introduced. But a blob may appear under several different paths over its history; only the first pathname would be used when making the "should ignore?" decision for the blob.
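
The last two complications can be sketched in a few lines. This is only an illustration in Python, not Nosey Parker's implementation: the helper name `should_ignore_blob`, the `first_path` parameter, and the use of `fnmatch` as a rough stand-in for real gitignore-style matching are all assumptions made for the example.

```python
import fnmatch
import logging
from typing import Optional, Sequence

log = logging.getLogger("ignore_sketch")


def should_ignore_blob(first_path: Optional[str], patterns: Sequence[str]) -> bool:
    """Decide whether to skip a blob, given only the path from the commit
    that first introduced it.

    Hypothetical helper: fnmatch is NOT gitignore matching, just a placeholder.
    """
    if first_path is None:
        # No path could be determined for this blob. One possible policy:
        # emit a warning and scan the blob anyway.
        log.warning("blob has no known path; ignore rules not applied")
        return False
    return any(fnmatch.fnmatchcase(first_path, pat) for pat in patterns)


patterns = ["*/vendor/*", "*.min.js"]

# The blob was first introduced under vendor/, so it is skipped, even if a
# later commit stores the identical blob at a path that matches no rule.
print(should_ignore_blob("project/vendor/lib.js", patterns))

# A blob whose path is unknown is scanned, with a warning.
print(should_ignore_blob(None, patterns))
```

Whether an unknown path should produce a warning, or be silently scanned, is exactly the open question raised above; the sketch picks one option only to make the trade-off concrete.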

bradlarsen • Jan 18 '24 16:01

There is also a general oddity in Nosey Parker's ignore rules. The ignore rules are `.gitignore`-style rules, whose semantics are relative to the directory that contains the `.gitignore` file. However, Nosey Parker uses this format to specify global rules: they are not intended to be directory-specific. The end result is that, essentially, all Nosey Parker ignore rules have to start with `**/`.
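
To make the anchoring issue concrete, here is a small Python sketch of how a rule that contains a slash behaves with and without a leading `**/`. The function `anchored_rule_to_regex` is hypothetical and handles only a tiny subset of gitignore syntax (literal characters, single `*`, and a leading `**/`); it is not how the `ignore` crate or Nosey Parker matches paths.

```python
import re


def anchored_rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a slash-containing rule from a small gitignore subset to a regex.

    Supported: literal characters, '*' (any run of non-slash characters),
    and a leading '**/' (match at any directory depth). Nothing else.
    """
    prefix = ""
    if rule.startswith("**/"):
        # A leading '**/' lets the rest of the rule match under any directory.
        prefix = "(?:.*/)?"
        rule = rule[3:]
    body = "".join("[^/]*" if ch == "*" else re.escape(ch) for ch in rule)
    return re.compile("^" + prefix + body + "$")


anchored = anchored_rule_to_regex("docs/secrets.txt")
floating = anchored_rule_to_regex("**/docs/secrets.txt")

for path in ["docs/secrets.txt", "project-a/docs/secrets.txt"]:
    print(path, bool(anchored.match(path)), bool(floating.match(path)))
```

Without the prefix, the rule only matches relative to the root of the rules file; with `**/` it matches in any subdirectory, which is the behavior one usually wants from a global ignore list.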

Perhaps the entire path-based ignore mechanism needs some rethinking in Nosey Parker.

bradlarsen • Jan 18 '24 16:01