Add `-vcs_ignore` flag via libgit2
Fixes https://github.com/tavianator/bfs/issues/42
This is entirely vibe coded, I don't really know C that well. It seemed to work, I didn't benchmark it (not really relevant for my use cases), I doubt this:
it should be an on/off thing regardless of where it appears on the command line.
is working. I'm posting this in hopes that someone takes it the rest of the way. Or, with guidance, I can do that too but I'm not gonna be guessing blindly here.
Thanks for the review.
Left some comments but there is a major design flaw with the current implementation. To work properly it needs to support multiple active repos simultaneously. If you want I can sketch out the ideas I had for that but it would be a major refactor.
Is your idea still using libgit2? Like checking for repo in each dir?
The way I use this tool, I don't care about nested repos, but I understand it's not as expected.
Also, is the whole #if BFS_WITH_LIBGIT2 needed/wanted, or can we just assume it exists?
Thanks for the review.
Left some comments but there is a major design flaw with the current implementation. To work properly it needs to support multiple active repos simultaneously. If you want I can sketch out the ideas I had for that but it would be a major refactor.
Is your idea still using libgit2? Like checking for repo in each dir?
libgit2 is fine. I was originally going to write my own .gitignore parser because I thought libgit2's license wasn't compatible with bfs, but actually it seems okay.
The hard part is to find somewhere to store the struct git_repository for every active path. Right now there are a couple different representations of active paths:
-
struct bftw_fileis private tobftw.c. There is a separate one for each path component, with parent pointers. -
struct BFTWis public, but there's only one alive at a time.
My idea is to unify these into something like struct bfs_path. This new type would chain like bftw_file, making it easy to hang extra data off each path component. So if you're at path/to/repo/sub/dir, you could grab the repo like path->parent->parent->repo.
The way I use this tool, I don't care about nested repos, but I understand it's not as expected.
I want bfs to work pretty much exactly like fd does regarding .gitignore support. If I merge an incomplete implementation that only supports one active repo, improving it in the future becomes a backwards-incompatible behaviour change. I try very hard to minimize those.
Also, is the whole
#if BFS_WITH_LIBGIT2needed/wanted, or can we just assume it exists?
Yes it's needed, all of bfs's dependencies are optional.
Hi, this iteration provided by Gemini 2.5 (and myself).
I ran it against my projects folder, and it provides the same result as running fd . [my_projects_dir] --exclude .git --hidden.
I noticed a couple of differences with fd, but those aren't bad:
- following a symlink into a dir with
.gitignorebut no repository doesn't respect the.gitignore(fd foes) -
.gitfolder is ignored automatically, unlikefd
The number of test cases could be increased to almost infinity (symlinks, repos, .gitignores everywhere, etc.), but I'm pretty happy with it based on real-world testing.
Looks like the case without libgit2 is failing, meaning these lines: invoke_bfs -quit -ignore_vcs || skip are not skipping. I now see there's a bunch of stuff like BFS_CAN_CHECK_ACL that I should implement for this flag, too.
Any comments before I go around fixing that?
I paid zero attention to the fact that there's other algorithms for traversing (bfs|dfs|ids|eds), and did not check compatibility with any of those so far.
Looks like the case without
libgit2is failing, meaning these lines:invoke_bfs -quit -ignore_vcs || skipare not skipping. I now see there's a bunch of stuff likeBFS_CAN_CHECK_ACLthat I should implement for this flag, too.
Right. You'll need to make it raise an error here: https://github.com/tavianator/bfs/pull/162/files#diff-041fe350636d21dd5c64d5bc8ea741855ea2f15a9dca7e0132a083fcd141ce54R1617
Any comments before I go around fixing that?
I'll give it a more thorough review soon.
I paid zero attention to the fact that there's other algorithms for traversing (bfs|dfs|ids|eds), and did not check compatibility with any of those so far.
I think it should be fine. bfs|dfs share most of the code, and ids|eds are just wrappers around dfs mostly.