Gitignore folders are not supported

Open aminya opened this issue 4 years ago • 20 comments

Describe the bug If you have something like this in the gitignore:

folder1/folder2/

scc still counts the files under folder1/folder2.

To Reproduce

  1. Add some files under folder1/folder2, and run scc

Expected behavior It should support folders
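
As a rough illustration of the expected behavior (not scc's actual code), an anchored directory pattern such as folder1/folder2/ should exclude every path beneath that directory. The helper name underIgnoredDir is hypothetical, and the sketch assumes literal patterns only, with no wildcards or negation:

```go
package main

import (
	"fmt"
	"strings"
)

// underIgnoredDir reports whether relPath (slash-separated, relative to the
// directory holding the .gitignore) lies inside the directory named by an
// anchored directory pattern such as "folder1/folder2/".
// Hypothetical helper: literal patterns only, no wildcards or negation.
func underIgnoredDir(pattern, relPath string) bool {
	dir := strings.Trim(pattern, "/")
	return strings.HasPrefix(relPath, dir+"/")
}

func main() {
	fmt.Println(underIgnoredDir("folder1/folder2/", "folder1/folder2/main.go")) // true
	fmt.Println(underIgnoredDir("folder1/folder2/", "folder1/other/main.go"))   // false
}
```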

Desktop (please complete the following information):

  • OS: Windows 10
  • Version: 3.0.0

aminya avatar Oct 05 '21 00:10 aminya

Yeah, it's down to the lack of a reliable .gitignore parser for Go.

I am aware and it's being worked on.

boyter avatar Oct 05 '21 04:10 boyter

Some time ago, I wrote a gitignore-to-glob converter in TypeScript. If there is no such library in Go, I can try porting it. I followed the gitignore specification exactly: https://github.com/aminya/globify-gitignore

aminya avatar Oct 05 '21 04:10 aminya

If you do, @aminya, I'll be happy to use it.

I started implementing one on this branch https://github.com/boyter/scc/tree/issue178 where I wanted to add ** support in, but ran into issues and moved on to some other issues.

Having this as an external library that can be used by other projects would be extremely helpful! I have a few projects that would use it instantly!

boyter avatar Oct 05 '21 21:10 boyter

I have started porting it https://github.com/aminya/globify_gitignore. This is my first Go package, so any help is appreciated.

aminya avatar Oct 06 '21 09:10 aminya

@boyter It is done! Could you test it and see if it works as expected: https://github.com/aminya/globify_gitignore

aminya avatar Oct 06 '21 12:10 aminya

Nice. Looking into it now. Will have some feedback in a few days.

boyter avatar Oct 06 '21 22:10 boyter

Have not forgotten this, but got distracted looking into alternative file walk Go implementations. I might try applying this on one of my other projects first though, probably https://github.com/boyter/dcd since it should be easier to implement there.

boyter avatar Oct 21 '21 21:10 boyter

By the way, let me know if we should add a function that directly gives the list of files instead of just the glob pattern.

aminya avatar Oct 23 '21 15:10 aminya

Sample test cases to use with this https://github.com/svent/gitignore-test

boyter avatar Dec 15 '21 23:12 boyter

I have written enough tests to ensure the functionality of the library. But we can always add more. The repo you sent takes an interesting approach to test this. It requires some work for integration.

aminya avatar Dec 15 '21 23:12 aminya

@aminya I tend to use a mix of unit and integration tests to ensure everything works as expected, and I can use it to compare against other tools such as rg, ag and git itself.

I have started here with the integration work, https://github.com/boyter/goignore which I would love you to have a look over as I am 100% certain I am doing something wrong. However the results are starting well with,

$ goignore
>>> .gitignore
2 Documentation/test.a.html foo: FAIL
2 Documentation/foo.html foo: OK
2 Documentation/gitignore.html foo: FAIL
2 Documentation/foo-excl.html foo: FAIL
2 foodir/bar/testfile foo: FAIL
2 dirpattern foo: OK
2 exclude/dir1/dir2/dir3/testfile foo: FAIL
2 file.o foo: FAIL
2 rootsubdir/foo foo: FAIL
2 subdir/subdir2/bar foo: FAIL
2 subdir/rootsubdir/foo foo: OK
2 subdir/hide/foo foo: FAIL
2 subdir/logdir/log/foo.log foo: FAIL
2 subdir/logdir/log/test.log foo: FAIL
2 subdir/logdir/log/findthis.log foo: FAIL
2 lib.a foo: FAIL
>>> .gitignore
5 git-sample-3/test foo: FAIL
5 git-sample-3/foo/test foo: FAIL
5 git-sample-3/foo/bar foo: OK
>>> .gitignore
2 htmldoc/jslib.min.js foo: FAIL
2 htmldoc/docs.html foo: OK
2 arch/foo/vmlinux.lds.S foo: FAIL
>>> .gitignore
3 arch/foo/kernel/vmlinux.lds.S foo: OK
2 log/foo.log foo: OK
2 log/test.log foo: FAIL
2 !important!.txt foo: FAIL
2 bar/testfile foo: OK
2 src/internal.o foo: FAIL
2 src/findthis.o foo: OK

compared to a test case,

$ rg ^foo: | sort
Documentation/foo.html:foo: OK
arch/foo/kernel/vmlinux.lds.S:foo: OK
bar/testfile:foo: OK
dirpattern:foo: OK
git-sample-3/foo/bar:foo: OK
htmldoc/docs.html:foo: OK
log/foo.log:foo: OK
src/findthis.o:foo: OK
subdir/rootsubdir/foo:foo: OK

$ git grep ^foo: | sort
Documentation/foo.html:foo: OK
arch/foo/kernel/vmlinux.lds.S:foo: OK
bar/testfile:foo: OK
dirpattern:foo: OK
git-sample-3/foo/bar:foo: OK
htmldoc/docs.html:foo: OK
log/foo.log:foo: OK
src/findthis.o:foo: OK
subdir/rootsubdir/foo:foo: OK

Which is not a bad start. Note that this is using the new Go file walk methods as well. The fail ones are the main ones I need to focus on.

If you are wondering, the reason I copied the code in there is that it helps me with debugging when I want to look through things and potentially add a quick println.

boyter avatar Dec 15 '21 23:12 boyter

Actually I note I get a lot of the following error (which is to be expected)

error parsing regexp: invalid nested repetition operator: `**`

As there are a lot of double ** in there. This is another bug I have been tracking https://github.com/boyter/scc/issues/178 which needs to be resolved. I suspect I am using your library incorrectly @aminya
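
For readers wondering where that error comes from, here is a minimal sketch assuming the gitignore pattern is handed straight to Go's regexp package. In a regular expression, * is a repetition operator, so a gitignore-style ** reads as one repetition stacked on another. The helper name compileAsRegexp and the sample patterns are illustrative only:

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAsRegexp tries to compile a gitignore-style pattern as if it were a
// Go regular expression and returns the compile error, if any.
// Hypothetical helper for illustration only.
func compileAsRegexp(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	for _, p := range []string{"folder1/folder2/", "docs/**/*.html"} {
		if err := compileAsRegexp(p); err != nil {
			// A ** sequence is rejected because the second * repeats the first.
			fmt.Printf("%q: %v\n", p, err)
		} else {
			// Compiling is not enough: the pattern now has regexp semantics,
			// not gitignore semantics.
			fmt.Printf("%q: compiles, but is treated as a regexp\n", p)
		}
	}
}
```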

boyter avatar Dec 16 '21 00:12 boyter

You should use a glob library. Do not use regexp to match paths. Something like https://github.com/gobwas/glob
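
To show what glob matching looks like in code without pulling in an external dependency, the sketch below uses the standard library's path.Match as a stand-in for a glob library. This is an assumption made for illustration: path.Match has no recursive ** wildcard, since each * stops at a / separator, which is one reason a dedicated glob package is the better fit for gitignore patterns. The helper name matches is hypothetical:

```go
package main

import (
	"fmt"
	"path"
)

// matches wraps path.Match and treats a malformed pattern as "no match".
// Hypothetical helper for illustration only.
func matches(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

func main() {
	// A single * matches within one path segment.
	fmt.Println(matches("log/*.log", "log/foo.log")) // true
	// * never crosses a / separator.
	fmt.Println(matches("*.log", "log/foo.log")) // false
	// ** is not recursive here: it behaves like a single *.
	fmt.Println(matches("a/**/b", "a/x/b"))   // true
	fmt.Println(matches("a/**/b", "a/x/y/b")) // false
}
```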

aminya avatar Dec 16 '21 00:12 aminya

Done.

This speaks to my ignorance of gitignore and its patterns, which is one of the reasons I have put this off for so long. Does the full path need to be passed into the patterns? How does this work for nested gitignore files?

boyter avatar Dec 16 '21 00:12 boyter

No worries. I should have added the glob integration to eliminate the confusion.

Glob patterns can be combined into a single list. So, the approach for supporting nested gitignore files is to:

  1. Find all the gitignore files using a glob pattern like **/.gitignore
  2. Pass their paths to globify_gitignore so it reads and parses each.
  3. Concatenate all the arrays returned by globify_gitignore
  4. Pass the big array of glob patterns to the glob package to get a whitelist of the paths that should be analyzed.

If you did this let me know, and I will be happy to merge it to globify_gitignore.

aminya avatar Dec 16 '21 08:12 aminya

Is there any progress on this, or a workaround to reliably ignore a list of folders?

theodorejb avatar Jun 29 '22 22:06 theodorejb

No progress yet, sorry. I have been too busy with work projects and have not had time outside of work to put into this. It is on my list of things to do.

If you want to ignore folders, however, you can put an .ignore file in the directory that contains the folder, with the name of the folder you want to ignore listed in it. If you don't use globbing rules this should work as is.

boyter avatar Jul 05 '22 01:07 boyter

Relates to https://github.com/boyter/scc/issues/261

boyter avatar Jul 28 '22 07:07 boyter

Move this to Release 3.3.0?

CarterLi avatar Mar 20 '24 02:03 CarterLi

My bad. I really need to get back on this one. The annoying thing is the amount of logic around this, meaning it's going to need a fair bit of effort to ensure it works.

boyter avatar Mar 20 '24 21:03 boyter