Discussion: show Git-ignored files by default?

Open sharkdp opened this issue 4 years ago • 63 comments

Since fd was first published, the feature to hide Git-ignored files by default has always been controversial. It's the number one pitfall for new users, as witnessed by the numerous issues that have been opened over time (even though this is the first point in the Troubleshooting section). Even experienced users will likely run into this from time to time.

We have had past discussions about this (see #179, #220, #18), but I'm not so sure anymore if this default is the best possible option for the "average user".

I thought it might make sense to discuss this again and see what others think. Whatever we choose as the default, it will always be easy for users to select a different default via an alias.

Pro current behavior (do not show .gitignored entries by default):

  • Most searches are faster if we take .gitignore files into account. .gitignored directories tend to contain huge amounts of automatically generated build artifacts or downloaded dependency files. Pruning these directories from the search tree typically results in a faster search overall. There are counterexamples to this where the parsing of long .gitignore files takes longer than actually traversing these directories.
  • Most of the time, .gitignored results are not "interesting" to the user (however, see counterpart below).
  • When running fd without any arguments, I typically don't want to see .gitignored files.
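
To make the pruning argument concrete, here is a minimal sketch over an in-memory tree. The `Node` type and the functions `count_visited` and `sample_project` are hypothetical names made up for this illustration; this is not how fd traverses the file system. It only shows that skipping an ignored directory also skips everything below it.

```rust
/// Hypothetical in-memory directory tree, used only for illustration.
struct Node {
    name: &'static str,
    children: Vec<Node>,
}

/// Counts the entries a traversal visits below `node`, pruning every entry
/// (together with its whole subtree) for which `is_ignored` returns true.
fn count_visited(node: &Node, is_ignored: &dyn Fn(&str) -> bool) -> usize {
    node.children
        .iter()
        .filter(|child| !is_ignored(child.name))
        .map(|child| 1 + count_visited(child, is_ignored))
        .sum()
}

/// Builds a tiny project: `src/main.rs` plus a `target/` directory that
/// holds `artifacts` generated files.
fn sample_project(artifacts: usize) -> Node {
    fn file(name: &'static str) -> Node {
        Node { name, children: Vec::new() }
    }
    Node {
        name: ".",
        children: vec![
            Node { name: "src", children: vec![file("main.rs")] },
            Node {
                name: "target",
                children: (0..artifacts).map(|_| file("artifact.o")).collect(),
            },
        ],
    }
}
```

With 1000 generated files under `target/`, a traversal that prunes `target` visits only `src` and `src/main.rs`, while an unpruned traversal visits every entry.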

Cons:

  • It can be very confusing to (new) users. If 10% of users go so far as to create a ticket on GitHub to ask about their problem, there must be hundreds of users that ran into this problem at some point.
  • Even if you know about the default, it can be annoying to repeat the search because you forgot to add -I or -u. There are a lot of valid use cases where users are - in fact - interested in results from ignored directories or files. Personally, I would estimate that I use -uu or -HI in roughly 20% of my searches, which is quite high.

sharkdp avatar Jun 07 '20 14:06 sharkdp

I want to add that the nature of the files in .gitignore depends a lot on the nature of our projects. In my case for example most of the time the ignored files are files with sensitive information (not crap) that I want to be able to search with fd.

But I understand that for other people often the files in .gitignore have to be ignored by fd as well.

For this reason the desired default behavior is not the same for everyone.

In my opinion, the default behavior should be "search all", because it's easier to figure out why there are too many results than it is to figure out why there are missing results.

But such a change in behavior will not be backward compatible, which is never good. To overcome this, people must be allowed to easily return to the old default behavior. Hence the need to be able to configure the default behavior of fd (#362).

kpym avatar Jun 07 '20 16:06 kpym

because it's easier to figure out why there are too many results than it is to figure out why there are missing results.

I think this is an excellent point.

But such a change in behavior will not be backward compatible, which is never good.

Agreed. This is probably the main reason why I never changed the behavior. Still, if it should turn out that 90% of users would like a different default, I'm more than willing to make a breaking change.

To overcome this, people must be allowed to easily return to the old default behavior. Hence the need to be able to configure the default behavior of fd (#362).

Okay, I have reopened the ticket once again. Let's discuss this aspect in #362. There are other ways to configure the default behavior as well (aliases, wrappers, environment variables).
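
As a rough sketch of the environment-variable route: default options could be read from a variable and placed in front of the arguments given on the command line. The variable name `FD_DEFAULT_OPTS` and the function `with_default_opts` are assumptions made up for this example; fd does not read such a variable, and naive whitespace splitting ignores shell quoting.

```rust
/// Prepends whitespace-separated default options (for example the value of
/// a hypothetical `FD_DEFAULT_OPTS` environment variable) to the arguments
/// given on the command line. Explicit arguments end up last, which lets
/// them win in parsers where later flags override earlier ones.
fn with_default_opts(defaults: Option<&str>, cli_args: &[&str]) -> Vec<String> {
    defaults
        .unwrap_or("")
        .split_whitespace()
        .chain(cli_args.iter().copied())
        .map(String::from)
        .collect()
}
```

A caller could pass `std::env::var("FD_DEFAULT_OPTS").ok().as_deref()` as the first argument. Keeping the function free of direct environment access makes it easy to test.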

sharkdp avatar Jun 07 '20 17:06 sharkdp

Personally, I think the pros outweigh the cons. But setting up an alias is easy, so I wouldn't be too upset if this changed.

tmccombs avatar Jun 07 '20 18:06 tmccombs

I am a new user and think this is a cool feature but I also think it should be the default as it is not obvious from any short description and is not intuitive.

"Power users" could easily set up an alias, as was commented earlier.

Weker01 avatar Jun 08 '20 17:06 Weker01

I would add my vote to search all files except hidden files by default.

jonathan-s avatar Jun 08 '20 20:06 jonathan-s

Hello.

I'd like to make the point that fd is a general-purpose file-searching utility that is not git-specific, so having it take into account .gitignore files lying around in the filesystem does not feel right. In fact, I stumbled upon this issue the very first time I tried fd: I tried to find something, starting from a non-git directory, in subdirectories which happened to be git repos, and found nothing, although I knew it was there. After that, the very next thing I did was patch fd locally so that it wouldn't read .gitignore files by default.

et342 avatar Jun 23 '20 21:06 et342

The dilemma

I think that for this issue and for #362 the question is:

Should fd behave as a "general" or a "git-style" tool?

What I mean by this is summarized in the following table

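|               | general tool | git-style tool                      |
|---------------|--------------|-------------------------------------|
| examples      | find, grep   | git, rg                             |
| ignore files  | no           | yes                                 |
| configuration | flags only   | flags, files, environment variables |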

Actually fd is in between the two worlds. And respecting the ignore files without the configuration feature is bad, IMO. So fd should choose the red pill or the blue pill ;)

My proposal

Or maybe we don't have to choose and we can have both.

I think that :

  • fd should be a general tool (by default)
  • there should be a flag --git that makes it act in a "git style"
  • fdg should be another compiled version from the same code but with different default behavior, equivalent to fd --git and having a --no-git flag to switch it to "general tool".

My arguments for creating the additional fdg are the following:

  • it should be easy to make the automation tool to build two versions, one with --no-git option as default, the other with --git as default.
  • the alias solution discussed already in #362 is very "shell specific" :
    • there is no general alias that can work in all shells (powershell and cmd included)
    • every time we download fd we should setup the aliases, so the advantage of "no install, just download and use it" is not valid any more
    • aliases do not work when you want to write a general script (let's say in Python) using fd
  • this solves the backward-compatibility problem with ignore files, as any user can simply switch to fdg, or even rename fdg to fd, and continue using it as before (respecting the ignore files by default)
  • this allows to solve not only this issue but also #362 and all other issues that ask to change the default behavior of fd (this would be done using the configuration file in fdg)
  • having both versions will make the supporters on both sides ("pro-general" and the "pro-git") happy

kpym avatar Jun 24 '20 06:06 kpym

I like that idea, with one caveat. Instead of having a separately compiled version of fd, fdg should just be a symlink to fd, and fd check the name that it is called with, and if it is "fd" use the general behavior, and if it is "fdg" use the "git" behavior. Or alternatively distribute OS-specific wrapper scripts for fdg (for example that does something like exec fd --ignore, or whatever the windows equivalent is).

tmccombs avatar Jun 24 '20 07:06 tmccombs

@tmccombs The problems with all workarounds (aliases, scripts, symlinks, ...) are the same:

  • they are all system/shell specific
  • in general they are ok in the linux world but not in windows
  • you have to configure all this before you start using the tool
  • they can generate additional problems that will create more issues here

Do you have arguments against having fdg in addition? Once the CI/CD tool is configured to compile both tools (for all systems), this does not create more work for the maintainers.

The only drawback I see is that having two versions can confuse novice users about which one to take. But this can easily be solved by promoting fd and mentioning fdg only in a section at the end of the README.

kpym avatar Jun 24 '20 08:06 kpym

Sorry, I am suggesting that the symlink or shell would be distributed with fd. Having separate, but nearly identical binaries means the package is larger (longer to download and more space on disk). fd is small enough it's probably not that big of a deal for most people, but it still feels wrong to me.

Once the CI/CD tool is configured to compile both tools (for all systems), this does not create more work for the maintainers.

It means builds take longer, which affects anyone who builds it from source.

tmccombs avatar Jun 24 '20 08:06 tmccombs

Sorry, I am suggesting that the symlink or shell would be distributed with fd.

This means that all scripts should be tested and maintained, and they will raise more issues here. I have had very bad experiences with provided scripts on Windows, for example with conda (the Python package manager). Providing additional scripts is the end of the "single executable tool".

Having separate, but nearly identical binaries means the package is larger (longer to download and more space on disk). fd is small enough it's probably not that big of a deal for most people, but it still feels wrong to me.

I don't know what you call "the package". The builds are larger, but the sources are not. And this argument is valid for all OS-specific builds. If we want smaller builds it is easy: provide only one (or zero) builds and tell people to build their own, but this is not very user-friendly.

Once the CI/CD tool is configured to compile both tools (for all systems), this does not create more work for the maintainers.

It means builds take longer, which affects anyone who builds it from source.

No. You build only the version that you need. This is already the case for all os specific builds. Only the CI/CD builds all the versions.

And if we have to choose only one type of build (general vs git style) which one this should be ? Giving a good solution only for one category of users and telling to the other category "get by with scripts" is not very user friendly, IMO.

kpym avatar Jun 24 '20 08:06 kpym

This means that all scripts should be tested and maintained, and they will raise more issues here.

The script I'm proposing is extremely simple: it would just call fd with an option and the arguments passed to the script. If we had a separate executable, that would have to be tested and maintained too :).

I don't know what you call "the package".

I mean the zip, tarball, or deb you download from the releases page, or a package you install with a package manager like apt, dnf, pacman, homebrew, winget (maybe?), etc.

Providing additional scripts is the end of "single executable tool".

From your next paragraph, I think what you mean by this is that you need to install more than just a single executable. But afaik, this is not a design goal of fd. And fd is not currently distributed as just an executable. All packages include command-completion scripts, and on Linux they also include man pages.

The builds are larger, but the sources are not. And this argument is valid for all OS-specific builds. If we want smaller builds it is easy: provide only one (or zero) builds and tell people to build their own, but this is not very user-friendly.

So you're saying we would have twice as many packages on the release page and in package managers/app stores? I really don't like that idea. I think it would make it even more confusing for users to know which package to install/download, and it roughly doubles the amount of work for packagers to maintain the package for package managers like apt, dnf, pacman, etc.

No. You build only the version that you need.

You're assuming no one would want both. This also complicates building from source, since the user now needs to be able to configure which version they want to build somehow (the OS they want is implicit in the OS they are running the build from, unless they are cross-building).

And if we have to choose only one type of build (general vs git style) which one this should be ? Giving a good solution only for one category of users and telling to the other category "get by with scripts" is not very user friendly, IMO.

That is not what I'm suggesting.

If using a script really is such a problem for windows, why not the symlink/command name approach?

fd would have something like:

// Sketch only: pick the default from the name fd was invoked as (argv[0]).
let cmd_name = std::env::args().next().unwrap_or_default();
if cmd_name.ends_with("fdg") {
  // use git-style default
} else {
  // use "general" default
}

Then on Unix-like systems, the package would just have a symlink from fdg to fd (or vice versa). If Windows doesn't have an equivalent, it could just have a copy of fd called fdg, or have a separate package if that makes more sense, and/or have instructions in the README to rename the file depending on what functionality you want.
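
A slightly more defensive variant of the name check can be sketched as a standalone function. It compares the file stem rather than the end of the string, so `fdg.exe` matches and an unrelated name such as `myfdg` does not. The function name `git_style_default` is made up for this sketch, `fdg` is the name proposed in this thread, and argv[0] is not guaranteed to reflect the real program name on every platform.

```rust
use std::path::Path;

/// Returns true when the program was invoked under the (proposed) name
/// `fdg`, ignoring any directory prefix and any file extension.
fn git_style_default(argv0: &str) -> bool {
    Path::new(argv0)
        .file_stem()
        .and_then(|stem| stem.to_str())
        .map_or(false, |stem| stem == "fdg")
}
```

In a real binary this would be called with `std::env::args().next().unwrap_or_default()`.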

tmccombs avatar Jun 24 '20 16:06 tmccombs

We already have ~/.fdignore. Maybe this file could somehow 'include' gitignore (via something like @~/.gitignore or some other character/directive)? With this approach showing git-ignored files could be enabled via default, also allowing user to add his git-ignored entries that are already in ~/.gitignore (or ./.git/ignore when inside repository) in an easy way?

Personally, I wasn't even aware that git-ignored files are omitted: https://imgur.com/a/UlLD8ED For now my .fdignore contains almost 100% of .gitignore plus other patterns. It would be great to be able to 'include' the file as a whole rather than copying its content manually.

ftpd avatar Jul 14 '20 15:07 ftpd

Maybe this file could somehow 'include' gitignore (via something like @~/.gitignore or some other character/directive)

Respecting .gitignore is more than just respecting a global ~/.gitignore file though. It also uses .gitignore files in ancestor directories and descendant directories (scoped to those directories). You could have some syntax in .fdignore to enable --ignore-vcs, but that isn't really an include so much as a flag. And we would still probably want a way to disable that with --no-ignore-vcs on the command line, while keeping the other ignores from .fdignore, which seems kind of weird to me.
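
The "ancestor directories" part can be illustrated with a small sketch that lists where .gitignore files affecting a given directory may live. The function name `gitignore_candidates` and its `repo_root` parameter are assumptions for this example. It does not parse any patterns, and it leaves out other sources git consults, such as .git/info/exclude and the global excludes file.

```rust
use std::path::{Path, PathBuf};

/// Lists the locations of `.gitignore` files that can affect `dir`: the one
/// in `dir` itself and one in each ancestor up to and including `repo_root`,
/// closest first. Returns an empty list if `dir` is not inside `repo_root`.
fn gitignore_candidates(dir: &Path, repo_root: &Path) -> Vec<PathBuf> {
    dir.ancestors()
        .take_while(|ancestor| ancestor.starts_with(repo_root))
        .map(|ancestor| ancestor.join(".gitignore"))
        .collect()
}
```

Files in descendant directories are not listed here, since a traversal only discovers them as it descends. That is one reason a single "include" line for one file cannot express the full behavior.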

tmccombs avatar Jul 14 '20 23:07 tmccombs

FWIW, ripgrep (which I imagine a lot of fd users also use) also respects gitignore by default.

joshuarli avatar Jul 21 '20 02:07 joshuarli

Thought dump:

git is primarily concerned about the contents of files, their state, but not their presence. This means that .gitignore files are also about the state of the files, but not their presence.

ripgrep, just like git, is also primarily concerned with the contents of files, and this shared concern makes its choice to consider .gitignore files understandable, although it could also be opt-in.

fd, on the other hand, is not concerned with the state of the files but with their presence, which differs from the concerns of git and ripgrep and makes its consideration of .gitignore files slightly less fitting.

et342 avatar Jul 21 '20 03:07 et342

fd, on the other hand, is not concerned about the state of the files, but is concerned about their presence

That's a good way of looking at it, my stance is pretty neutral now. As a long-time user I'd have no problem making an inverse alias to fdh: aliased to fd --no-ignore-vcs -H -E '.git/', but I still feel there's value in performant (I assume most fd users have git repos in their filesystems) defaults. Plus, it's already established and would take some effort to flip.

joshuarli avatar Jul 21 '20 04:07 joshuarli

Just to throw my opinion into the ring. I'm in favor of changing the default.

When I use fd with no options/arguments other than a pattern, 99% of the time I'm just using it to quickly narrow down the list of files I need to look at to find what I want. In that case I'm okay if I get some things that I don't care about in the search, but I'm much more annoyed if I don't find something that's actually there because I forgot to specify that I wanted to search .gitignored files as well. @sharkdp said that he adds -H or -u around 20% of the time, meaning those flags mattered 20% of the time. But I'm willing to bet that if those flags were enabled by default then they would have to be disabled much less than 20% of the time.

Also, from a scripting/reducing-noise perspective, when I'm doing something more precise than just quickly narrowing down a list of files, I'm normally more willing to add flags and check the documentation in order to narrow down the search results to only what I care about.

And concerning adding an fdg binary (or symlink), I don't see how that's better than just adding an optional flag. It feels like it would complicate CI and packaging a lot for something that essentially just flips a flag on by default.

elihunter173 avatar Jul 31 '20 23:07 elihunter173

I have an additional suggestion, which could complement the proposal or make it obsolete: add a corresponding description to tldr. The tldr page for rg/ripgrep has an entry for rg -uu pattern, which is the second result and thus findable in about a second.

20% typing the thing would then overall still mean less time. Bonus is that -uu could be established as use hidden github stuff or "do more work".

One client for tldr is tealdeer.

matu3ba avatar Aug 02 '20 02:08 matu3ba

Some thoughts on a couple of points brought up in this thread, though nothing terribly new.

  1. Conflating git ignore with general ignore

I'm also in favour of changing the default, because IMO paths being present in .gitignore means literally what it says on the tin: "not interesting for the purposes of version control", and I wish that tools like fd as well as ripgrep did not overload this definition to mean roughly "not interesting to search/scan in by default". I, like many others here, have had false negatives due to this. It's (subjectively!) a bit sad that in order to be confident in a negative search result one has to either provide extra flags, or rerun the search using a "legacy" tool.

  2. Special treatment of .gitignore with regards to other similar files

fd, or to be more precise the ignore crate that it uses, appears to only support git's ignore files, which means fd's behaviour for, say, Mercurial users will be different. Firefox, arguably the poster child for Rust, uses Mercurial for example.

  3. Prior art re: aliases/separate binaries

This applies to grep, and is likely Linux (or perhaps even Debian) specific:

root@9fb4e89aea1b:/# man grep | head -4
GREP(1)                                                                User Commands                                                               GREP(1)

NAME
       grep, egrep, fgrep, rgrep - print lines that match patterns

So, using aliases for commonly-used flags is at the very least nothing new.

kvelicka avatar Aug 05 '20 09:08 kvelicka

Some thoughts on a couple of points brought up in this thread, though nothing terribly new.

1. Conflating git ignore with general ignore

I'm also in favour of changing the default, because IMO paths being present in .gitignore means literally what it says on the tin: "not interesting for the purposes of version control", and I wish that tools like fd as well as ripgrep did not overload this definition to mean roughly "not interesting to search/scan in by default". I, like many others here, have had false negatives due to this. It's (subjectively!) a bit sad that in order to be confident in a negative search result one has to either provide extra flags, or rerun the search using a "legacy" tool.

2. Special treatment of .gitignore with regards to other similar files

Did you ever run grep on repos with huge binary files (>5 GB) or a large number of files ignored by .gitignore? Binary data in particular (without newlines) takes linear time to scan, which is why ripgrep has a different default than grep. The same argument can be made for fd when .gitignore covers many, many files, e.g. when compiling the Linux kernel.

fd, or to be more precise the ignore crate that it uses, appears to only support git's ignore files, which means fd's behaviour for say Mercurial users will be different. Firefox, arguably the poster child for Rust, uses Mercurial for example.

An argument from authority is not a technical argument about usage. And you can't make everyone happy with the tool. Here is a short catalogue for decision making:

  1. Usage consistency: What kind of consistency with other tools (arguments and effects) should fd aim for? (For me that is Rust and ripgrep, if possible.)
  2. Usage purpose: Ignoring build files supports dev environments, where you frequently want to search for (relative) file paths in a complex tree. (The effect is a clear speed win, as fewer paths and files need to be traversed.)
  3. Oriented user base: If the author chooses to support such a thing, when should a version control tool be supported? (Market share, user base?)
  4. Clarification: How should it be documented? (manual, cheat sheet, tealdeer, tldr)
  5. Technical feasibility: How many build files can be "hosted" (as a result of codegen) on Mercurial to justify ignoring them?
  6. Usage feasibility: How many build files are "hosted" (as a result of codegen) on Mercurial to justify ignoring them?

3. Prior art re: aliases/separate binaries

This applies to grep, and is likely Linux (or perhaps even Debian) specific:

root@9fb4e89aea1b:/# man grep | head -4
GREP(1)                                                                User Commands                                                               GREP(1)

NAME
       grep, egrep, fgrep, rgrep - print lines that match patterns

So, using aliases for commonly-used flags is at the very least nothing new.

How should this be maintained and name-clashes prevented ? cfdisk, df, efi,rfkill are already used. Do you have specific names in mind?

matu3ba avatar Aug 05 '20 14:08 matu3ba

This applies to grep, and is likely Linux (or perhaps even Debian) specific:

root@9fb4e89aea1b:/# man grep | head -4
GREP(1)                                                                User Commands                                                               GREP(1)

NAME
       grep, egrep, fgrep, rgrep - print lines that match patterns

So, using aliases for commonly-used flags is at the very least nothing new.

On Ubuntu at least, egrep, fgrep and rgrep are simply shell scripts that run grep with the given options. I would be ok with that, or with a symlink approach. But I'd rather not have separate but nearly identical compiled binaries.

tmccombs avatar Aug 05 '20 16:08 tmccombs

(repo maintainer - this reply certainly cuts close to being off-topic, please feel free to remove it)

Did you ever run grep on repos with huge binary files (>5 GB) or a large number of files ignored by .gitignore? Binary data in particular (without newlines) takes linear time to scan, which is why ripgrep has a different default than grep. The same argument can be made for fd when .gitignore covers many, many files, e.g. when compiling the Linux kernel.

  • grep ignores binaries by default as well
  • most repos don't contain 5GB binaries, nor nearly as many build artifacts as the Linux kernel

I'd like to refer back to the original issue, which talks about picking the best default for the "average user", not about removing support for ignoring based on .gitignore altogether. Overall, the ability to piggyback on .gitignore is tremendously helpful. However, I have been bitten by it myself and have seen others in the same situation, hence my personal stance of "don't look at .gitignore by default". It's an anecdotal account and you can absolutely find more valid critiques against it, but I don't think that's gonna advance the overall discussion much more.

An argument from authority is not a technical argument about usage. And you can't make everyone happy with the tool. Here is a short catalogue for decision making:

My argument (2) is definitely a weak one, and you're right that it's impossible to make everyone happy (e.g. look at us having this very discussion!). I mentioned it because fd (ripgrep is arguably more guilty, but that's offtopic) talks about version control ignore files in a general sense, but in fact supports only git. From fd --help:

  • "<...>that would otherwise be ignored by '.*ignore' files"
  • the very name of --no-ignore-vcs flag

Again, this is only a quibble.

How should this be maintained and name-clashes prevented ? cfdisk, df, efi,rfkill are already used. Do you have specific names in mind?

I don't have any specific suggestions here. My aim is only to highlight that using ~~aliases~~ shell scripts to invoke certain flags is something that already ships with popular Linux distros.

kvelicka avatar Aug 05 '20 20:08 kvelicka

fd (ripgrep is arguably more guilty, but that's offtopic) talks about version control ignore files in a general sense, but in fact supports only git

I believe this is for forwards compatibility. So if at some future point the ignore crate adds support for additional VCS systems (such as mercurial), then the --no-ignore-vcs flag will just work, without having to add a new --no-ignore-hg flag and similar for each VCS system that is added.

tmccombs avatar Aug 05 '20 22:08 tmccombs

  • grep ignores binaries by default as well
  • most repos don't contain 5GB binaries, nor nearly as many build artifacts as the Linux kernel

Not, when you did not write a binary-data header or alike.

My argument (2) is definitely a weak one, and you're right that it's impossible to make everyone happy (e.g. look at us having this very discussion!). I mentioned it because fd (ripgrep is arguably more guilty, but that's offtopic) talks about version control ignore files in a general sense, but in fact supports only git. From fd --help:

  • "<...>that would otherwise be ignored by '.*ignore' files"
  • the very name of --no-ignore-vcs flag

Again, this is only a quibble.

True. @sharkdp Why is checking .*ignore* for files/folders to ignore not possible? The file paths need to be checked against the .gitignore anyway, or is there a limit of 1 or 2?

I don't have any specific suggestions here. My aim is only to highlight that using ~aliases~ shell scripts to invoke certain flags is something that already ships with popular Linux distros.

I don't like them at all and prefer shell aliases.

matu3ba avatar Aug 05 '20 23:08 matu3ba

If you do make this change, please consider using separate flags for .gitignore, .fdignore, etc. I have run into valid use cases for (observe .fdignore, ignore .gitignore) and vice versa.

Examples:

  • -I/--ignore -- Ignore file patterns in .gitignore and .fdignore
  • -Ig/--ignore-gitignore -- Ignore file patterns in .gitignore
  • -If/--ignore-fdignore -- Ignore file patterns in .fdignore
  • -N/--no-ignore -- do Not ignore file patterns in .gitignore and .fdignore
  • -Ng/--no-ignore-gitignore -- do Not ignore file patterns in .gitignore
  • -Nf/--no-ignore-fdignore -- do Not ignore file patterns in .fdignore

Unfortunately these all use double-negatives, and there is a potential confusion about the double-meaning of "ignore .gitignore" (ignoring the .gitignore file and ignoring the files within it have opposite meaning). Other terms that may be less confusing:

  • E/N - --Exclude-from (exclude files listed in .{}ignore), do Not exclude
  • U/N - Use, do Not use

There is precedent for fine-grained ignore params in ripgrep: (--no-ignore-dot, --no-ignore-global, --no-ignore-vcs, etc.)

[ If the above comment about supporting non-git repositories is implemented, then Ig might become Iv (for vcs) ]

tobiww avatar Aug 19 '20 03:08 tobiww

I agree with having more granular control over ignore files. However, I'd rather use flags consistent with what ripgrep uses, both for consistency and because I think they are a little less confusing.

tmccombs avatar Aug 19 '20 04:08 tmccombs

I think the point of having a tool like this is that it's opinionated. If I have to start adding flags to reach the default behavior/length of find, I might as well use find

Like rg and the rest of the modern tools, what makes them great is their defaults. If the only advantage is a very minor speed bump, people would just use the preinstalled tools they already know.

The fact that it ignores hidden and gitignored files is in the main bullet point feature list. If one doesn't bother to read that...

fregante avatar Sep 09 '20 17:09 fregante

Wow, thanks for pinning this issue. I manage my dotfiles by creating a git repo in ~ and then adding * to ~/.gitignore. I was so confused at why I wasn't getting any results in my home directory.

I think we should follow the convention from grep and git grep:

  • fd does not exclude files from .gitignore
  • git fd excludes files from .gitignore

Workarounds: I need to make sure that any time fd is invoked (by any script!) it's called with --no-ignore-vcs. I'd love to be able to set this in a configuration file or environment variable.

christianbundy avatar Oct 29 '20 21:10 christianbundy

Wow, thanks for pinning this issue. I manage my dotfiles by creating a git repo in ~ and then adding * to ~/.gitignore. I was so confused at why I wasn't getting any results in my home directory.

I think we should follow the convention from grep and git grep:

  • fd does not exclude files from .gitignore
  • git fd excludes files from .gitignore

Workarounds: I need to make sure that any time fd is invoked (by any script!) it's called with --no-ignore-vcs. I'd love to be able to set this in a configuration file or environment variable.

This is not a convention. git grep is a subcommand of git while grep is a completely different program.

Weker01 avatar Oct 29 '20 23:10 Weker01

This is not a convention.

This is a convention -- the git-grep binary is provided by the Git project, but from the perspective of a user it's clear that git-grep takes into account .gitignore whereas grep does not. I'm suggesting that Git-specific usage should be provided under git-fd rather than fd. In summary:

  • foo -- should aim to be as general-purpose as practical
  • git-foo -- should aim to be as Git-specific as practical

git grep is a subcommand of git while grep is a completely different program.

Yes, I agree.

christianbundy avatar Oct 30 '20 01:10 christianbundy

fd does not exclude files from .gitignore
git fd excludes files from .gitignore

I have a few issues with this:

  1. git grep works very differently than fd. While git grep runs grep on each file that is checked in (basically equivalent to git ls-files | xargs grep), fd does file traversal itself, in parallel, which is where you get a lot of the speed.
  2. git subcommands are generally implied to be operating on a single git repository. But I sometimes want to use fd (and rg) in a directory that contains multiple git repos and still respect .gitignore files.
  3. It means adding a new executable that needs to be installed, and two levels of indirection when calling it.

In short, I don't think it is any better of an option than making --no-ignore-vcs the default rather than --ignore-vcs.

tmccombs avatar Nov 02 '20 04:11 tmccombs

In case it helps, I thought fd was broken while searching for something I knew was in my node_modules directory due to this.

aral avatar Feb 20 '21 21:02 aral

Windows 7 user here. I have a dedicated folder with CLI tools added to PATH variable. For example, there is ripgrep with .ripgreprc next to it, which contains settings I need with every launch like --smart-case or coloring preferences. I like the portability of this approach instead of polluting %UserProfile%, registry or creating more env variables. So it would be nice to have similar configuration here. I would use it to make -H permanent, because I always forget to add it (find shows all).

sergeevabc avatar Apr 02 '21 06:04 sergeevabc

Just adding my experience here that this caught me off guard multiple times. Most users install fd as a replacement for find, so it can be surprising when it doesn't show certain files by default.

fats avatar Apr 02 '21 15:04 fats

Adding this to the "fd 9" milestone because I would like to settle this discussion and introduce the (possible) breaking change in that version (see #613).

sharkdp avatar Aug 08 '21 14:08 sharkdp

My suggestion would be to not change the default, so we don't break anyone's workflow. Instead, how about something like this:

$ fd -e o
(9 ignored files skipped, 2 hidden files skipped; see fd -h for details)

With #595 implemented, users could make fd an alias to fd --no-hidden --ignore to keep the current behaviour and suppress the warning, or to fd -HI to show everything by default.

I'd be okay with always printing that warning, even if there are matches. But especially if there are no matches it might be handy.

tavianator avatar Aug 10 '21 13:08 tavianator

If directories are skipped, would it count those as 1?

tmccombs avatar Aug 10 '21 14:08 tmccombs

It would, yeah. Maybe the exact number isn't important though. How about "some ignored files and hidden files were skipped"?

tavianator avatar Aug 10 '21 14:08 tavianator

I'm in favor of @tavianator's suggestion. For me, the current default makes perfect sense because I'm generally not interested in searching ignored files. The other mentioned proposals (two separate binaries, configuration files, a lot more flags) are, in my humble opinion, the opposite of simple and user-friendly ways to deal with this "problem". To me, it feels like breaking a butterfly on a wheel. Just let the user know if and how many ignored or hidden files have been skipped. Informative, straightforward, easy to implement.

pemistahl avatar Aug 10 '21 17:08 pemistahl

Just let the user know if and how many ignored or hidden files have been skipped.

I would also be in favour of this. Also inform the user how they can make sure not to skip these files.

jonathan-s avatar Aug 10 '21 19:08 jonathan-s

easy to implement

Well :smile:. I'm not so sure about this. We currently use the ignore crate for parallel directory traversal + gitignore handling. I don't think it (currently) has a mode where it gives us all files, but marks the ones that would be ignored (or similar). But I might be wrong.

The bigger problem I see is with performance. If we want to show this warning, it would mean that we will always have to parse gitignore files, even in the presence of -I/--no-ignore. This can potentially result in significant performance regressions for (no-ignore) searches.

sharkdp avatar Aug 10 '21 19:08 sharkdp

Well :smile:. I'm not so sure about this. We currently use the ignore crate for parallel directory traversal + gitignore handling. I don't think it (currently) has a mode where it gives us all files, but marks the ones that would be ignored (or similar). But I might be wrong.

Yeah, might require a patch to ignore. We don't need to know what paths they were, or even how many, just whether it ignored anything.

If we want to show this warning, it would mean that we will always have to parse gitignore files, even in the presence of -I/--no-ignore.

I don't think we need to show the warning when -I is passed, at least about ignored files. We could still warn about hidden files unless -H is passed.

tavianator avatar Aug 10 '21 19:08 tavianator

I don't think we need to show the warning when -I is passed, at least about ignored files.

Oh - of course! In this case that's a route that we should explore :+1:

sharkdp avatar Aug 10 '21 19:08 sharkdp

I like that idea as well. Although for scripting you probably don't want that in the output. But I think writing it to stderr would probably be ok, and stderr can be redirected to /dev/null if desired.
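A small sketch of why stderr works for scripting: matches go to stdout and the notice goes to stderr, so a command substitution or `2>/dev/null` never sees the notice. `search_with_notice` is a hypothetical stand-in for fd with made-up matches, not real fd output.

```shell
# Hypothetical stand-in for fd: prints matches on stdout and a
# "files were skipped" notice on stderr.
search_with_notice() {
    printf '%s\n' 'src/main.o' 'src/lib.o'
    printf '%s\n' '(some ignored and hidden files were skipped; see fd -h for details)' >&2
}

# Interactive use: both streams reach the terminal.
search_with_notice

# Scripted use: only stdout is captured; the notice is discarded.
matches=$(search_with_notice 2>/dev/null)
count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')
echo "captured $count lines"
```

Because the notice never touches stdout, existing pipelines such as `fd ... | xargs ...` would keep working unchanged.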

tmccombs avatar Aug 11 '21 05:08 tmccombs

My 2 cents: if fd is advertised as an alternative to find, it should behave like find as closely as possible. I was initially confused as well.

99% of the time I want to ignore git files, but I would still prefer fd to search everything by default. It's fd, not fdg, after all. The lack of command line options should default to the most general and inclusive behaviour, imho. It's how most command line tools behave, I think.

For example I'd prefer something like:

fd .           # don't ignore anything, as find
fd . -i=fig    # respect .fdignore, .ignore and .gitignore
fd . -I        # respect all ignore files

mg979 avatar Oct 02 '21 03:10 mg979

Although I'm in favor of fd searching gitignored files by default, fd already makes a fairly radical break from find by switching the order of parameters.

omentic avatar Oct 02 '21 04:10 omentic

Another thing to add to my suggestion, having just debugged #876: perhaps if errors occur during the search, that should be noted as well. Something like

$ fd certbot.pem /etc
(17 errors occurred, results may be incomplete; pass --show-errors for details)

tavianator avatar Nov 04 '21 16:11 tavianator

If this default changes, I would humbly request that an inverse CLI flag allows us to override previous CLI flags.

For example...

# Case-insensitive
fd -s -i

# Case-sensitive
fd -i -s

# Will ignore 
fd --no-ignore --ignore

# Will not ignore
fd --ignore --no-ignore

This way folks can easily choose their default via an interactive shell alias, but still have the option e.g.

alias fd="fd --ignore"
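To illustrate the "last flag wins" behaviour requested here, a minimal shell sketch follows. `effective_ignore_mode` is a hypothetical helper, not part of fd, and the built-in default of `ignore` is an assumption that mirrors the current behaviour.

```shell
# Hypothetical helper: reports the effective ignore mode for a list of flags.
# Later flags override earlier ones ("last one wins").
effective_ignore_mode() {
    mode=ignore   # assumed built-in default
    for arg in "$@"; do
        case "$arg" in
            --ignore)        mode=ignore ;;
            --no-ignore|-I)  mode=no-ignore ;;
        esac
    done
    printf '%s\n' "$mode"
}

# An alias contributes its flags first; flags typed by the user come after.
effective_ignore_mode --ignore               # alias only:          ignore
effective_ignore_mode --ignore --no-ignore   # alias + override:    no-ignore
effective_ignore_mode --no-ignore --ignore   # reversed order:      ignore
```

With this rule, whatever the alias sets can always be undone by a flag typed later on the command line.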

jchook avatar Nov 16 '21 00:11 jchook

@jchook This should be done already by https://github.com/sharkdp/fd/pull/822

tavianator avatar Nov 16 '21 02:11 tavianator

I also like @tavianator's proposal, possibly with the following caveats:

  • fd should print a warning only if outputting to stdout in an interactive shell
  • The warning should not include counts for performance reasons but simply state:
(Rerun with `-u` to also search ignored files, `-uu` to search all files)
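A rough sketch of the first caveat, assuming "interactive" is approximated by checking whether stdout is a terminal. `maybe_print_hint` is a hypothetical helper for illustration; fd itself would do the equivalent check internally.

```shell
# Hypothetical helper: emit the hint only when stdout (file descriptor 1)
# is attached to a terminal. The hint itself goes to stderr.
maybe_print_hint() {
    if [ -t 1 ]; then
        printf '%s\n' '(Rerun with `-u` to also search ignored files, `-uu` to search all files)' >&2
    fi
}

maybe_print_hint          # on a terminal: the hint is shown
maybe_print_hint | cat    # stdout is a pipe: nothing is printed
```

The `[ -t 1 ]` test is a standard POSIX check, so piping or redirecting the output is enough to silence the hint without any extra flag.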

andreavaccari avatar Nov 16 '21 23:11 andreavaccari