Gitignore folders are not supported
Describe the bug If you have something like this in the gitignore:
folder1/folder2/
scc still counts the files under folder1/folder2.
To Reproduce
- Add some files under folder1/folder2, and run scc
Expected behavior It should support folder patterns in the gitignore and skip the files under them.
Desktop (please complete the following information):
- OS: [e.g. macOS, Windows, Linux] Windows 10
- Version [e.g. 22] 3.0.0
Yeah, it's down to the lack of a reliable .gitignore parser for Go.
I am aware of it and it's being worked on.
Some time ago, I wrote a gitignore-to-glob converter in TypeScript. If there is no such library in Go, I can try porting it. I followed the gitignore specification exactly: https://github.com/aminya/globify-gitignore
If you do, @aminya, I'll be happy to use it.
I started implementing one on this branch https://github.com/boyter/scc/tree/issue178 where I wanted to add ** support in, but ran into issues and moved on to some other issues.
Having this as an external library that can be used by other projects would be extremely helpful! I have a few projects that would use it instantly!
I have started porting it https://github.com/aminya/globify_gitignore. This is my first Go package, so any help is appreciated.
@boyter It is done! Could you test it and see if it works as expected: https://github.com/aminya/globify_gitignore
Nice. Looking into it now. Will have some feedback in a few days.
Have not forgotten this, but got distracted looking into alternative file walk Go implementations. I might try applying this on one of my other projects first though, probably https://github.com/boyter/dcd since it should be easier to implement there.
By the way, let me know if we should add a function that directly gives the list of files instead of just the glob pattern.
Sample test cases to use with this https://github.com/svent/gitignore-test
I have written enough tests to ensure the functionality of the library. But we can always add more. The repo you sent takes an interesting approach to test this. It requires some work for integration.
@aminya I tend to use a mix of unit and integration tests to ensure everything works as expected, and I can use it to compare against other tools such as rg, ag and git itself.
I have started the integration work here, https://github.com/boyter/goignore, which I would love you to have a look over as I am 100% certain I am doing something wrong. However, the results are off to a good start:
$ goignore
>>> .gitignore
2 Documentation/test.a.html foo: FAIL
2 Documentation/foo.html foo: OK
2 Documentation/gitignore.html foo: FAIL
2 Documentation/foo-excl.html foo: FAIL
2 foodir/bar/testfile foo: FAIL
2 dirpattern foo: OK
2 exclude/dir1/dir2/dir3/testfile foo: FAIL
2 file.o foo: FAIL
2 rootsubdir/foo foo: FAIL
2 subdir/subdir2/bar foo: FAIL
2 subdir/rootsubdir/foo foo: OK
2 subdir/hide/foo foo: FAIL
2 subdir/logdir/log/foo.log foo: FAIL
2 subdir/logdir/log/test.log foo: FAIL
2 subdir/logdir/log/findthis.log foo: FAIL
2 lib.a foo: FAIL
>>> .gitignore
5 git-sample-3/test foo: FAIL
5 git-sample-3/foo/test foo: FAIL
5 git-sample-3/foo/bar foo: OK
>>> .gitignore
2 htmldoc/jslib.min.js foo: FAIL
2 htmldoc/docs.html foo: OK
2 arch/foo/vmlinux.lds.S foo: FAIL
>>> .gitignore
3 arch/foo/kernel/vmlinux.lds.S foo: OK
2 log/foo.log foo: OK
2 log/test.log foo: FAIL
2 !important!.txt foo: FAIL
2 bar/testfile foo: OK
2 src/internal.o foo: FAIL
2 src/findthis.o foo: OK
compared to a test case,
$ rg ^foo: | sort
Documentation/foo.html:foo: OK
arch/foo/kernel/vmlinux.lds.S:foo: OK
bar/testfile:foo: OK
dirpattern:foo: OK
git-sample-3/foo/bar:foo: OK
htmldoc/docs.html:foo: OK
log/foo.log:foo: OK
src/findthis.o:foo: OK
subdir/rootsubdir/foo:foo: OK
$ git grep ^foo: | sort
Documentation/foo.html:foo: OK
arch/foo/kernel/vmlinux.lds.S:foo: OK
bar/testfile:foo: OK
dirpattern:foo: OK
git-sample-3/foo/bar:foo: OK
htmldoc/docs.html:foo: OK
log/foo.log:foo: OK
src/findthis.o:foo: OK
subdir/rootsubdir/foo:foo: OK
Which is not a bad start. Note that this is using the new Go file walk methods as well. The failing ones are the main ones I need to focus on.
If you are wondering, the reason I copied the code in there is that it helps me with debugging when I want to look through things and potentially add a quick println.
Actually I note I get a lot of the following error (which is to be expected)
error parsing regexp: invalid nested repetition operator: `**`
As there are a lot of double ** in there. This is another bug I have been tracking https://github.com/boyter/scc/issues/178 which needs to be resolved. I suspect I am using your library incorrectly @aminya
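For anyone following along, the error is easy to reproduce with the standard library alone. This is a minimal sketch; the pattern `src/**/*.o` and the helper name `compileAsRegexp` are made up for illustration and are not taken from scc or goignore.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAsRegexp feeds a gitignore-style glob straight into regexp.Compile.
// In a regular expression "*" is a repetition operator, so "**" stacks two
// of them and the pattern is rejected.
func compileAsRegexp(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	// Hypothetical glob, chosen only because it contains "**".
	if err := compileAsRegexp("src/**/*.o"); err != nil {
		fmt.Println(err)
	}
}
```

The takeaway is that a glob has to be matched as a glob, or translated before it reaches the regexp engine; it cannot be compiled as-is.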
You should use a glob library. Do not use regexp to match paths. Something like https://github.com/gobwas/glob
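To illustrate what glob matching with `**` means without pulling in a dependency, here is a small standard-library sketch. It is not how gobwas/glob is implemented and it is not full gitignore semantics; `matchGlob` and `matchSegments` are hypothetical names, and the only assumption is that `**` stands for zero or more whole path segments.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchGlob reports whether name matches pattern. Both use "/" as separator.
// A "**" segment matches zero or more whole path segments; every other
// segment is matched with path.Match, so a single "*" never crosses a "/".
func matchGlob(pattern, name string) bool {
	return matchSegments(strings.Split(pattern, "/"), strings.Split(name, "/"))
}

func matchSegments(pat, segs []string) bool {
	if len(pat) == 0 {
		return len(segs) == 0
	}
	if pat[0] == "**" {
		// Let "**" consume 0, 1, 2, ... leading segments and try the rest.
		for i := 0; i <= len(segs); i++ {
			if matchSegments(pat[1:], segs[i:]) {
				return true
			}
		}
		return false
	}
	if len(segs) == 0 {
		return false
	}
	ok, err := path.Match(pat[0], segs[0])
	if err != nil || !ok {
		return false
	}
	return matchSegments(pat[1:], segs[1:])
}

func main() {
	fmt.Println(matchGlob("src/**/*.o", "src/internal.o"))
	fmt.Println(matchGlob("src/**/*.o", "src/a/b/file.o"))
	fmt.Println(matchGlob("src/*.o", "src/a/file.o"))
}
```

A real glob library will be faster and cover more syntax; the sketch only shows why matching segment by segment avoids the regexp problem.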
Done.
This speaks to my ignorance of gitignore and its patterns, which is one of the reasons I have put this off for so long: does the full path need to be passed to the patterns? And how does this work for nested gitignore files?
No worries. I should have added the glob integration to eliminate the confusion.
Glob patterns can be summed up together. So, the approach for supporting nested git ignore files is to:
- Find all the gitignore files using a glob pattern like **/.gitignore
- Pass their paths to globify_gitignore so it reads and parses each.
- Concatenate all the arrays returned by globify_gitignore
- Pass the big array of glob patterns to the glob package to get a whitelist of the paths that should be analyzed.
If you did this let me know, and I will be happy to merge it to globify_gitignore.
Is there any progress on this, or a workaround to reliably ignore a list of folders?
No progress yet sorry. Been too busy on work projects and have not had time outside of work to actually put time on this. It is on my list of things to do.
If you want to ignore folders, you can however put a .ignore file in the directory that contains the folder, with the name of the folder you want to ignore in it. If you don't use globbing rules this should work as is.
Relates to https://github.com/boyter/scc/issues/261
Move this to Release 3.3.0?
My bad. I really need to get back on this one. The annoying thing is the amount of logic around this, meaning it's going to need a fair bit of effort to ensure it works.